We apply a replica-inference-based Potts model method to unsupervised image segmentation on
multiple scales. This approach was inspired by the statistical mechanics problem of "community
detection" and its phase diagram. Specifically, the problem is cast as identifying tightly bound
clusters ("communities" or "solutes") against a background or "solvent". Within our multiresolution
approach, we compute information-theory-based correlations among multiple solutions ("replicas")
of the same graph over a range of resolutions. Significant multiresolution structures are identified by
replica correlations as manifest in information theory overlaps. With the aid of these correlations as
well as thermodynamic measures, the phase diagram of the corresponding Potts model is analyzed
both at zero and finite temperatures. Optimal parameters that yield a sensible unsupervised
segmentation correspond to the "easy phase" of the Potts model. Our algorithm is fast and is shown
to be at least as accurate as the best algorithms to date and to be especially suited to the detection
of camouflaged images.
I. INTRODUCTION

"Image segmentation" refers to the process of parti- 
tioning a digital image into multiple segments based on 
certain visual characteristics [1-3]. Image segmentation is typically used to locate objects and boundaries in images. The result of image segmentation is a set of segments that collectively cover the entire image or a set of contours extracted from the image. This problem is challenging (see, e.g., Fig. 1) and important in many fields. Examples of its omnipresent use include, amongst many others, medical imaging [4] (e.g., locating tumors and anatomical structures), face recognition [5], fingerprint recognition [6], and machine vision [7]. Numerous algorithms and methods have been developed for image segmentation. These include thresholding [8], clustering [9], compression [10, 11] and histogram based approaches, edge detection [12], region growing [13], split and merge [15], gradient flows and partial differential equation based approaches [14, 16], graph partitioning methods and normalized cuts [17, 18], Markov random fields and mean field theories [19-22], watershed transformation [23], random walks [24], isoperimetric methods [25], neural networks [26], and a variety of other approaches, e.g., [27-29].

In this work, we will apply a "community detection" al- 
gorithm to image segmentation. This method belongs to 
the graph partitioning category. Community detection 
[30-33] seeks to identify groups of nodes densely connected within their own group ("community") and more
weakly connected to other groups. A solution enables 
the partition of a large physically interacting system into 
optimally decoupled communities. The image is then di- 
vided into different regions ("communities") based on a 
certain criterion, and each resulting region corresponds 
to an object in the original image. 

FIG. 1: Examples of currently challenging problems in image segmentation. Left: An image of a zebra (courtesy of Ref. [34]) against a similar "striped" background. Right: An image of a dalmatian dog [35]. Most people do not initially recognize the dog before being given clues as to its presence. Once the dog is seen, it is nearly impossible to perceive the image in a meaningless way [35].

It is notable that by virtue of its graph theoretical nature, community detection is suited for the study of arbitrary dimensional data. However, unlike general high dimensional graphs, images are two (or three) dimensional.
Thus, real images are far simpler than higher dimensional 
data sets as, e.g., evinced by the four color theorem stat- 
ing that four colors suffice to mark different neighboring 
regions in a segmentation of any two dimensional image. 
Thus, geometrical (and topological) constraints can be 
used to further improve the efficiency of the bare graph 
theoretical method. In [36], [37], in the context of analyzing structures of complex physical systems such as
glasses, we used geometry dependent physical potentials 
to set the graph weights in various two and three dimen- 
sional systems. In the case of image segmentation, in the 
absence of underlying physics, we will invoke geometrical 
cut-off scales. 

In this work, we will discuss "unsupervised" image segmentation. By this term, we allude to a general multi-purpose segmentation method based on a general physical intuition. The current method does not involve any initial "training" of the algorithm, i.e., the system is not provided with known examples in which specific patterns are told to correspond to specific objects. We leave the study of supervised image segmentation and more sophisticated extensions of our inference procedure to future work. One possible avenue to explore is inference that goes beyond the simple comparison of different "replicas" discussed in this manuscript and that builds on prior knowledge (and prior probabilities in a Bayesian type analysis) of expected patterns in the studied images.

We will, specifically, apply the multiresolution commu- 
nity detection method, first introduced in [38], to inves- 
tigate the overall structure at different resolutions in the 
test images. Similar to [38], we will employ information
based measures (e.g., the normalized mutual information 
and the variation of information) to determine the sig- 
nificant structures at which the "replicas" (independent 
solutions of the same community detection algorithm) 
are strongly correlated. With the aid of these informa- 
tion theory correlations, we illustrate how we may discern 
structures at different pertinent resolutions (or spatial 
scales). An image may be segmented at different levels 
of detail and scales by setting the resolution parameters 
to these pertinent values. In a detailed study of various test cases, we demonstrate how our method works in practice and the high accuracy that it attains.



II. OUTLINE 

The outline of our work is as follows. In Section III, we introduce the Potts model representation of image segmentation and the Potts model Hamiltonians that we will use. These Hamiltonians were earlier derived for graph theory applications. In Section IV, we discuss how we represent images as graphs. In Section V, we briefly define the key concepts of trials and replicas, which are of great importance in our approach. In Sec. VI, we present our community detection algorithm. In Sec. VII, we discuss the multiresolution method and the information based measures. In Section VIII, we illustrate how replica correlations may be used to set graph weights. For the benefit of the reader, we compile the list of parameters in Section IX. We discuss the computational complexity of our method in Section X. In Sec. XI, we provide in silico "experimental results" of our image segmentation method when applied to many different examples. These examples include, amongst others, the Berkeley image segmentation and the Microsoft Research benchmarks. We conclude in Sec. XII with a summary of our results. Specific aspects are further detailed in the appendices.



III. POTTS MODELS 

In what follows, we will briefly elaborate on our par- 
ticular Potts model representations of images and the 
corresponding Hamiltonians (energy or cost functions). 

A. Representation 

As is well appreciated, different objects in an image, or more generally communities in complex graph theoretical problems, are ultimately denoted by a "Potts type" [39] variable $\sigma_i$. That is, if node $i$ lies in community number $w$ then $\sigma_i = w$. If there are $q$ communities in the graph then $\sigma_i$ can assume the values $1 \le \sigma_i \le q$. A state $\{\sigma_i\}_{i=1}^{N}$ corresponds to a particular partition (or segmentation) of the system into $q$ communities (or objects). In the context of image segmentation, Potts model representations can, e.g., also be found in [40-43].

B. Potts model Hamiltonian for unweighted graphs 

In [44], a particular Potts model Hamiltonian was introduced for community detection. The ground states of this Hamiltonian (or lowest energy states) correspond to optimal partitions of the nodes into communities. This Hamiltonian does not involve a comparison relative to random graphs ("null models") [31] and as such is free of the "resolution limit" problems [31], [45], [46] wherein the number of found communities or objects scales with the system size in a way that is independent of the actual system studied. In what follows, there are $N$ elementary nodes in a graph (or pixels in an image). We consider general unweighted graphs in which any pair of nodes may be either linked with a uniform weight or not linked at all. Specifically, a link between sites $i$ and $j$ is associated with edge weights $A_{ij}$ and $J_{ij}$. In these unweighted graphs, $A_{ij}$ is an element of the adjacency matrix. That is, $A_{ij} = 1$ if nodes $i$ and $j$ are connected by an edge and $A_{ij} = 0$ otherwise. The weights $J_{ij} = (1 - A_{ij})$.

The goal of the general (or "absolute") Potts model Hamiltonian [44] was to energetically favor any pair of linked nodes being in the same community and to penalize any pair of unlinked nodes being in the same community, and conversely for nodes in different communities (penalizing two linked nodes that lie in different communities and favoring disjoint nodes that lie in different communities). Putting all of these bare energetic considerations together (sans any comparisons to random graphs), the resulting Potts model Hamiltonian (or energy function) for a system of $N$ nodes simplifies to [44], [49]

$$\mathcal{H}(\{\sigma_s\}_{s=1}^{N}) = -\frac{1}{2} \sum_{i \neq j} (A_{ij} - \gamma J_{ij})\, \delta(\sigma_i, \sigma_j). \tag{1}$$

In Eq. (1), we emphasize the dependence of the Hamiltonian on the $N$ different variables $\{\sigma_s\}$ at each lattice site $s$ (each of which can assume $q$ values). In what follows, the dependence of the Hamiltonian on $\{\sigma_s\}_{s=1}^{N}$ will always be understood.

The Kronecker delta $\delta(\sigma_i, \sigma_j) = 1$ if $\sigma_i = \sigma_j$ and $\delta(\sigma_i, \sigma_j) = 0$ if $\sigma_i \neq \sigma_j$. In this Hamiltonian, by virtue of the $\delta(\sigma_i, \sigma_j)$ term, each spin interacts only with other spins in its own community. As such, the resulting model is local, a feature that enables high accuracy along with rapid convergence [44].

As noted above, minimizing this Hamiltonian corresponds to identifying strongly connected clusters of nodes. The parameter $\gamma$ is the so-called "resolution parameter", which adjusts the relative weights of the linked and unlinked edges, as is easily seen by inspecting Eq. (1). A high value of $\gamma$ leads to forbidding energy penalties unless all intra-community nodes "attract" one another and lie in the same community. By contrast, $\gamma = 0$ does not penalize the inclusion of any additional nodes in a given community and the lowest energy solution generally corresponds to a single community that spans the entire system.
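To make the role of $\gamma$ in Eq. (1) concrete, the following minimal Python sketch evaluates the energy of a given partition. The function name `potts_energy`, the list-of-lists adjacency format, and the six-node test graph are illustrative assumptions and are not taken from the original implementation.

```python
from itertools import combinations

def potts_energy(adjacency, sigma, gamma):
    """Energy of Eq. (1) for an unweighted graph.

    adjacency: symmetric 0/1 matrix (list of lists); sigma: community label per node.
    The sum over i != j with the prefactor 1/2 counts each unordered pair once.
    """
    energy = 0.0
    for i, j in combinations(range(len(sigma)), 2):
        if sigma[i] == sigma[j]:            # Kronecker delta
            a_ij = adjacency[i][j]
            j_ij = 1 - a_ij                 # weight of a missing link
            energy -= a_ij - gamma * j_ij
    return energy

# Hypothetical test graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

two_triangles = [0, 0, 0, 1, 1, 1]
one_community = [0] * n
for gamma in (0.0, 1.0):
    print(gamma, potts_energy(A, two_triangles, gamma), potts_energy(A, one_community, gamma))
```

For this toy graph, the two-triangle partition has energy -6 for any $\gamma$, whereas merging all nodes into one community gives -7 + 8$\gamma$. The single community is thus preferred at $\gamma = 0$ but not at $\gamma = 1$, in line with the discussion above.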



C. Potts Model Hamiltonian for Weighted Graphs 

In weighted graphs, we assign to the edges between nodes weights that reflect the interaction strength (e.g., in image segmentation problems the (dis-)similarity of the intensity or color defines the edge weight). Specifically, in image segmentation problems, we determine (based on, e.g., color or intensity differences) the weight $V_{ij}$ between each member of a node pair. We then shift each such value by an amount set by a background $\bar{V}$, i.e., $V'_{ij} = (V_{ij} - \bar{V})$. The subtraction relative to the background $\bar{V}$ allows our community detection algorithm to better partition the network of pixels. In principle, this background can be set to be spatially non-uniform. However, in this work we set $\bar{V}$ to be a constant. Thus, we generalize the earlier model of [38] in Eq. (1) by the inclusion of a background $\bar{V}$ and by allowing for continuous weights $V_{ij}$ instead of discrete weights
that are prevalent in graph theory. The resulting Hamiltonian [36], [37] reads

$$\mathcal{H} = \frac{1}{2} \sum_{i \neq j} (V_{ij} - \bar{V}) \left[ \Theta(\bar{V} - V_{ij}) + \gamma\, \Theta(V_{ij} - \bar{V}) \right] \delta(\sigma_i, \sigma_j). \tag{2}$$

The form of this Hamiltonian and that of Eq. (1) was inspired by positive and negative energy terms that favor the formation of tightly bound clusters (or "solutes") that are more weakly coupled to their surroundings [49]. Similar to the important effects of the solute found in physical systems [50], the Hamiltonian of Eq. (2) captures all interactions in the system [49]. Earlier [36], [37], we invoked the Hamiltonian of Eq. (2) to analyze static and dynamic structures in glasses.

In Eq. (2), the number of communities $q$ may be specified as an input or left arbitrary, in which case the algorithm decides it by steadily increasing the number of communities $q$ for which we have low energy solutions. The Heaviside function $\Theta(x)$ "turns on" or "off" the edge designation [$\Theta(x > 0) = 1$ and $\Theta(x < 0) = 0$] relative to the aforementioned background $\bar{V}$. As before, minimizing the Hamiltonian of Eq. (2) corresponds to identifying strongly connected clusters of nodes.

While in Eq. (1) (or Eq. (2)) the input concerns two-point ($p = 2$) edge weights $A_{ij}$ (or $V_{ij}$), it is, of course, possible to extend these Hamiltonians to allow for more general motifs (such as $p = 3$ node triangles) and include $p \ge 3$ point weights $V_{ijk}$ (and extensions thereof). These correspond to $p$ spin interactions. In the current study, however, we limit ourselves to $p = 2$ node weights.



IV. CASTING IMAGES AS NETWORKS

We will now detail how we translate images into networks with general edge weights that appear in Eqs. (1, 2). We will represent pixels as the nodes in a graph. Edge weights define the (dis-)similarity between neighboring pixels.

Images may be broadly divided into two types: (a) those with uniform intensity and (b) those with varying intensity. "Uniform intensity" means that the entire object or each component is colored by one intensity or color. By the designation of "varying intensity", we allude to objects or components that exhibit alternating intensities or colors, e.g., the stripes and spots seen in Fig. 1.

Regarding the above two types of images, two different methods may be employed to define the edge weights: (i) The intensity/color difference between nodes is defined as the edge weight in images with uniform intensity. (ii) The overlap between discrete Fourier transforms of blocks is defined as the edge weight in images with varying intensity. The second method is designed to distinguish the target and the background by their specific frequencies. We will detail both methods below in Sec. IV A (where we discuss images with uniform intensities) and Sec. IV B (spatially varying intensities).



A. Edge definition for images with uniform intensity



For images of uniform intensity, we will define edges based on the color (dis-)similarity. For the unweighted Potts model of Eq. (1), we will assign an edge between two pixels ($i$ and $j$) if the "color" difference ($D_{ij}$) between them is less than some threshold ($\bar{V}$). That is,

$$A_{ij} = \Theta(\bar{V} - D_{ij}). \tag{3}$$

For the weighted Potts model of Eq. (2), we will, as we will elaborate on momentarily, set the weights $V_{ij}$ to be the "color" difference ($D_{ij}$) between nodes $i$ and $j$, i.e.,

$$V_{ij} = D_{ij}. \tag{4}$$

As seen from the energy functions of Eqs. (1, 2), a large dissimilarity $V_{ij}$ favors nodes $i$ and $j$ being in different clusters.
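The effect of a large dissimilarity can be checked numerically. The sketch below evaluates the weighted energy under the assumption that Eq. (2) carries an overall prefactor of +1/2, so that weights above the background are penalized and weights below it are rewarded. The function names, the value assigned to the step function at zero, and the five-pixel intensity list are hypothetical choices made for illustration.

```python
def heaviside(x):
    """Theta(x): 1 for x > 0, 0 otherwise (the value at x = 0 is a convention chosen here)."""
    return 1.0 if x > 0 else 0.0

def weighted_potts_energy(V, sigma, v_bar, gamma):
    """Energy of Eq. (2): V[i][j] is the dissimilarity of nodes i and j, v_bar the background."""
    energy = 0.0
    n = len(sigma)
    for i in range(n):
        for j in range(i + 1, n):           # each unordered pair once (the 1/2 in Eq. (2))
            if sigma[i] != sigma[j]:
                continue                    # Kronecker delta
            shifted = V[i][j] - v_bar
            energy += shifted * (heaviside(-shifted) + gamma * heaviside(shifted))
    return energy

# Hypothetical grey-scale intensities: three dark pixels and two bright ones.
intensity = [10, 12, 11, 200, 205]
V = [[abs(a - b) for b in intensity] for a in intensity]   # V_ij = D_ij, Eqs. (4, 5)

split = [0, 0, 0, 1, 1]       # dark and bright pixels in separate communities
merged = [0, 0, 0, 0, 0]      # everything in one community
print(weighted_potts_energy(V, split, v_bar=20, gamma=1.0))
print(weighted_potts_energy(V, merged, v_bar=20, gamma=1.0))
```

With these numbers the split partition has the lower energy, because every dark-bright pair exceeds the background and would contribute a positive term if placed in the same community.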

A grey-scale image is an image in which the value of each pixel carries only intensity information. Images of this sort are composed exclusively of shades of gray, varying from black at the weakest intensity ($I = 0$) to white at the strongest ($I = 255$). For a grey-scale image, the "color" difference is the absolute value of the intensity difference, i.e.,

$$D_{ij} = |I_i - I_j|. \tag{5}$$

A "color image" is an image that includes color information for each pixel. Each pixel contains three color components: red, green and blue (or RGB). The value of the intensity of each of these three components may attain any of $2^8$ values (any integer in the interval [0, 255]). For a color image, we define the "color" difference as the average of the differences between the color components red, green and blue. That is, with $R_i$, $G_i$, and $B_i$ respectively denoting the strengths of the red, green, and blue color components at site $i$, we set

$$D_{ij} = \frac{1}{3} \left( |R_i - R_j| + |G_i - G_j| + |B_i - B_j| \right). \tag{6}$$

We do not store edges between every pair of nodes. Rather, edges connect nodes whose distance is less than a tunable value $\Lambda$.
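The construction of Eqs. (5, 6) together with the distance cutoff can be sketched as follows. This is an illustration only: the function names, the nested-list image format, and the use of the Euclidean pixel distance for the cutoff are assumptions, since the text does not specify the distance measure.

```python
import math

def color_difference(p, q):
    """Eq. (5) for grey-scale pixels (ints) and Eq. (6) for RGB pixels (3-tuples)."""
    if isinstance(p, tuple):
        return sum(abs(a - b) for a, b in zip(p, q)) / 3.0
    return float(abs(p - q))

def build_edges(image, cutoff):
    """Return {((r1, c1), (r2, c2)): D_ij} for all pixel pairs closer than `cutoff`."""
    rows, cols = len(image), len(image[0])
    reach = int(math.ceil(cutoff))
    edges = {}
    for r1 in range(rows):
        for c1 in range(cols):
            for r2 in range(r1, min(rows, r1 + reach + 1)):
                for c2 in range(max(0, c1 - reach), min(cols, c1 + reach + 1)):
                    if (r2, c2) <= (r1, c1):
                        continue            # skip the pixel itself and pairs already stored
                    if math.hypot(r2 - r1, c2 - c1) < cutoff:
                        edges[((r1, c1), (r2, c2))] = color_difference(
                            image[r1][c1], image[r2][c2])
    return edges

# Hypothetical 2 x 2 grey-scale image.
img = [[0, 10], [250, 255]]
print(build_edges(img, cutoff=1.1))   # nearest neighbors only
print(build_edges(img, cutoff=1.5))   # nearest neighbors and diagonals
```

Storing only pairs within the cutoff keeps the number of edges proportional to the number of pixels rather than to its square.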



B. Edge definition for images with varying intensities

Typically, images with varying intensities contain different patterns. To separate these patterns, we construct a "block-structure" containing the quintessential pattern information. We next introduce a method to divide an image into blocks and then elaborate on two different ways to connect edges between blocks.

General contending pertinent scales may be determined by, e.g., examining the peaks of the Fourier transform of an entire image (whose location yields the inverse wave-length and whose width reveals the corresponding spatial extent of these modulations). While such simple transforms may aid optimization in determining candidate parameter scales, our algorithm goes far beyond such simple global measures.



1. Overlapping blocks

We will divide an entire image of size $N = N_x \times N_y$ into $N$ overlapping blocks. These blocks are centered about each (of the $N$) pixels and are of size $L_x \times L_y$. The dimensions of the individual blocks are, generally, far smaller than those of the entire system, $L_{x,y} \ll N_{x,y}$. General scales can be gleaned from a Fourier transform of the entire image.

To construct the connection matrix between the blocks, we connect edges between each pair of blocks and set the distance between the nearest block pair to be 1. This choice has the benefit of overlapping the nearest neighbor blocks, which share more in common. Fig. 2 gives a schematic plot of the "overlapping block" structure.

FIG. 2: [Color Online.] An example of overlapping blocks. The block size is $L_x \times L_y = 5 \times 5$. The nearest neighbor of the block enclosed by "purple" in the x-direction is the one enclosed by "red", and its nearest neighbor in the y-direction is the one enclosed by "yellow". They are connected due to the nearest neighbor condition.


2. Average intensity difference between blocks 

Following the construction of the overlapping block structure, we next compute the average intensity of each block and connect the edges between blocks based on the difference. In this case, each block can be treated as a "super-node" which contains the pattern information of the studied image.

To further incorporate geometrical scales, we multiply the edge weights by $\exp(-|\mathbf{r}_m - \mathbf{r}_n|/\ell)$ (where $\ell$ is a tunable length scale and the vectors $\mathbf{r}_m$ and $\mathbf{r}_n$ denote the spatial locations of points $m$ and $n$). We remind the reader that there are $N$ basic blocks and thus $N$ possible values of $m$ (and $N$ possible values of $n$). We will set, in Eq. (2), the weights $V_{mn}$ between blocks $m$ and $n$ to be

$$V_{mn} = \frac{D_{mn} \exp(-|\mathbf{r}_m - \mathbf{r}_n|/\ell)}{L_x L_y}, \tag{7}$$

where

$$D_{mn} = (1 - \delta(m,n)) \sum_{i=0}^{L_x-1} \sum_{j=0}^{L_y-1} |I_m(i,j) - I_n(i,j)|. \tag{8}$$
As seen in Eq. (8), $D_{mn}$ is the sum of the absolute values of the intensity differences between blocks $m$ and $n$, with each of these blocks being of size $L_x \times L_y$. In Eq. (7), $|\mathbf{r}_m - \mathbf{r}_n|$ is the physical distance between blocks $m$ and $n$ (i.e., the distance between the central nodes of each block).

The geometrical factor $\exp(-|\mathbf{r}_m - \mathbf{r}_n|/\ell)$ in Eq. (7), with a tunable length scale $\ell$, can be set to prefer (and, as we will illustrate, also to detect) certain scales in the image. This enables the algorithm to detect clusters of varying sizes that contain rich textures.
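A minimal sketch of the block weights follows, reading Eq. (8) as a pixel-by-pixel sum of absolute intensity differences and Eq. (7) as that sum damped by the distance factor and divided by the block area. The helper names and, in particular, the clamping of block coordinates at the image boundary are assumptions; the text does not state how blocks near the edge are treated.

```python
import math

def extract_block(image, center, lx, ly):
    """L_x x L_y block centred on `center`; coordinates outside the image are clamped
    to the nearest edge pixel (a boundary convention assumed here)."""
    rows, cols = len(image), len(image[0])
    r0, c0 = center
    return [[image[min(max(r0 + dr - ly // 2, 0), rows - 1)]
                  [min(max(c0 + dc - lx // 2, 0), cols - 1)]
             for dc in range(lx)] for dr in range(ly)]

def block_weight(image, m, n, lx, ly, ell):
    """V_mn of Eq. (7) with D_mn from Eq. (8); m and n are the block-centre pixels."""
    if m == n:
        return 0.0                          # the (1 - delta(m, n)) factor
    bm = extract_block(image, m, lx, ly)
    bn = extract_block(image, n, lx, ly)
    d_mn = sum(abs(bm[a][b] - bn[a][b]) for a in range(ly) for b in range(lx))
    distance = math.hypot(m[0] - n[0], m[1] - n[1])
    return d_mn * math.exp(-distance / ell) / (lx * ly)

# Hypothetical 3 x 6 image: dark left half, bright right half.
image = [[0, 0, 0, 100, 100, 100] for _ in range(3)]
print(block_weight(image, (1, 1), (1, 4), lx=3, ly=3, ell=3.0))
```

In this example the two blocks differ by 100 in every pixel and their centres are three pixels apart, so the weight is 100 exp(-1).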



3. Fourier amplitude derived weights 

As it is applied to image segmentation, the utility of 
Fourier transformations is well appreciated. We next dis- 
cuss how to invoke these in our Potts model Hamiltonian. 
To highlight their well known and obvious use, we note 
that, e.g., the stripes of the zebra in Fig. 1 contain wave-vectors which are different from those of the more uniformly modulated background. Thus, a spatially local
Fourier transform of this image may distinguish the ze- 
bra from the background. We will now invoke Fourier 
transforms in a general way in order to determine the 
edge weights in our network representation of the image. 

With the preliminaries of setting up the block structure in tow, we now apply a discrete Fourier transform inside each block. Rather explicitly, excluding the spatial origin, the local discrete 2-D Fourier transform of a general quantity $f_m$ within block $m$ with internal Cartesian coordinates $(a, b)$ is

$$F_m(k,l) = \sum_{a=0}^{L_x-1} \sum_{b=0}^{L_y-1} f_m(a,b)\, e^{2\pi i (ka/L_x + lb/L_y)} - f_m(0,0). \tag{9}$$

The wave-vector components are $k = 0, 1, \ldots, L_x - 1$ and $l = 0, 1, \ldots, L_y - 1$. In applications, we set, for grey-scale images, $f_m(a,b)$ to be the intensity $I$ at site $(a,b)$ in block $m$ (a site whose location relative to the origin of the entire image we denote by $\mathbf{r}_{ab;m}$). That is, $f_m(a,b) = I(\mathbf{r}_{ab;m})$. In color images, we set $f$ to be the average over the intensities of the red, green and blue components: $f(a,b) = \frac{1}{3}(R(a,b) + G(a,b) + B(a,b))$. We fix the couplings $J_{mn}$ between blocks $m$ and $n$ to be

$$J_{mn} = \sum_{k=0}^{L_x-1} \sum_{l=0}^{L_y-1} |F_m^{*}(k,l)\, F_n(k,l)|. \tag{10}$$

We connect blocks whose spatial separation is less than the aforementioned tunable cutoff distance $\Lambda$ by links having edge weights $V_{mn}$. In practice, we fixed $\Lambda$. With Eq. (10) in hand, we set

$$V_{mn} = (\delta(m,n) - 1)\, J_{mn} \exp(-|\mathbf{r}_m - \mathbf{r}_n|/\ell) \tag{11}$$

in Eq. (2). In this case, the background $\bar{V}$ would be negative.



When inverting the sign of the left hand side of Eq. (11) (shown in Appendix C), our algorithm will also be suited for the detection of changing objects against a more uniform background.
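The Fourier-derived weights of Eqs. (9)-(11) can be sketched with a direct (slow) transform written in plain Python. The function names and the striped test blocks are hypothetical. The sign of the exponent in the transform is an assumption as well; for real-valued intensities it does not affect the modulus that enters Eq. (10).

```python
import cmath
import math

def block_dft(block):
    """Discrete 2-D Fourier transform of one block with f(0, 0) subtracted, as in Eq. (9).
    block[a][b]: a runs over L_x values and b over L_y values."""
    lx, ly = len(block), len(block[0])
    origin = block[0][0]
    return [[sum(block[a][b] * cmath.exp(2j * math.pi * (k * a / lx + l * b / ly))
                 for a in range(lx) for b in range(ly)) - origin
             for l in range(ly)] for k in range(lx)]

def fourier_coupling(block_m, block_n):
    """J_mn of Eq. (10): sum over (k, l) of |F_m*(k, l) F_n(k, l)|."""
    fm, fn = block_dft(block_m), block_dft(block_n)
    return sum(abs(fm[k][l].conjugate() * fn[k][l])
               for k in range(len(fm)) for l in range(len(fm[0])))

def fourier_weight(block_m, block_n, r_m, r_n, ell):
    """V_mn of Eq. (11) for blocks centred at r_m and r_n."""
    if r_m == r_n:
        return 0.0                          # the (delta(m, n) - 1) factor
    distance = math.hypot(r_m[0] - r_n[0], r_m[1] - r_n[1])
    return -fourier_coupling(block_m, block_n) * math.exp(-distance / ell)

# Hypothetical 4 x 4 blocks: stripes of period 2 and the same stripes shifted by one pixel.
stripes = [[0] * 4, [1] * 4, [0] * 4, [1] * 4]
shifted = [[1] * 4, [0] * 4, [1] * 4, [0] * 4]
print(fourier_coupling(stripes, stripes))
print(fourier_weight(stripes, shifted, (0, 0), (0, 2), ell=2.0))
```

The resulting weights are negative for distinct blocks with overlapping spectra, which is consistent with the remark above that the background $\bar{V}$ would be negative in this case.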

We now briefly comment on the relation between the Fourier space overlaps and weights in Eqs. (10, 11) and the real space overlaps and weights in Eqs. (7, 8). It is notable that in Eq. (10) we sum over the modulus of the products of the Fourier amplitudes. By Parseval's theorem, sans the modulus in the summation in Eq. (10), $J_{mn}$ would be identical to the overlap in real space between $f_m$ and $f_n$. Such real space overlaps directly relate to the real-space overlaps in Eq. (8) [following a replacement of the absolute value in Eq. (8) by its square and an overall innocuous multiplicative scale factor]. Thus, without the modulus in Eq. (10), the Fourier space calculation outlined above affords no benefit over its real space counterpart. Physically, the removal of the phase factors when performing the summation in Eq. (10) avoids knowledge of the relative location of the origins between different blocks. This allows different regions of a periodic pattern to be strongly correlated and clumped together. By contrast, for a periodic wave of a particular wave-vector, the real space overlaps between blocks $m$ and $n$ may vanish when the origins of blocks $m$ and $n$ are displaced by a real space distance that is equal to half of the wavelength of the periodic wave along the modulation direction. Thus, the real space weights as derived from Eqs. (7, 8) may vanish when the corresponding Fourier space derived weights (Eqs. (10, 11)) are sizable.

It is possible to improve on the simple Fourier space 
derived weights by a general wavelet analysis. 



V. DEFINITIONS: TRIALS AND REPLICAS 

In the following sections, we will discuss our specific 
algorithms for (i) community detection and (ii) multi- 
scale community detection. Before giving the specifics 
of our algorithms, we wish to introduce two concepts on 
which our algorithms are based. Both pertain to the use 
of multiple identical copies of the same system (image) 
which differ from one another by a permutation of the 
site indices. Thus, whenever the time evolution may de- 
pend on sequentially ordered searches for energy lowering 
moves (as it will in our greedy algorithm), these copies 
may generally reach different final candidate solutions. 
By the use of an ensemble of such identical copies, we can attain accurate results as well as determine information theory correlations between candidate solutions and infer from these a detailed picture of the system.

In the definitions of "trials" and "replicas" given below, we build on the existence of a given algorithm (any algorithm) that may minimize a given energy or cost function. In our particular case, we minimize the Hamiltonians of Eqs. (1, 2).

• Trials. We use trials alone in our bare community detection algorithm [38], [44]. We run the algorithm on the same problem $t$ independent times. This may generally lead to different contending states that minimize Eqs. (1, 2). Out of these $t$ trials, we will pick the lowest energy state and use that state as the solution.

• Replicas. We use both trials and replicas in our multi-scale community detection algorithm [44]. Each sequence of the above described $t$ trials is termed a replica. When using "replicas" in the current context, we run the aforementioned $t$ trials (and pick the lowest energy solution) $r$ independent times. By examining information theory correlations between the $r$ replicas, we infer which features of the contending solutions are well agreed on (and thus are likely to be correct) and on which features there is a large variance between the disparate contending solutions, which may generally mark important physical boundaries. We will compute the information theory correlations within the ensemble of $r$ replicas. Specifically, information theory extrema as a function of the scale parameters generally correspond to more pertinent solutions that are locally stable to a continuous change of scale. It is in this way that we will detect the important physical scales in the system.

These definitions might seem fairly abstract for the moment. We will flesh these out and reiterate their definitions anew when detailing our specific algorithms, to which we turn next.



VI. THE COMMUNITY DETECTION 
ALGORITHM 

Our community detection algorithm for minimizing Eqs. (1, 2) follows four steps [44].

(1) We partition the nodes based on a "symmetric" or "fixed q" initialization (q is the number of communities).

• "Symmetric" initialization alludes to an initialization wherein each node forms its own community (and thus, initially, there are q = N communities).

• "Fixed q" initialization corresponds to a random initial distribution of all nodes into q communities.

For the application of image segmentation, "symmetric" initialization is used for the "unsupervised" case. In this case, the algorithm does not know what to look for, and thus the "symmetric" initialization provides the advantage of no bias towards a particular community. The algorithm will decide the number of communities q by merging nodes whenever doing so yields a lower energy solution.

"Fixed q" initialization may be used in a "supervised" image segmentation. The community membership of individual nodes will be changed to lower the solution energy. One has to decide how much information is needed by observing the original image and enter the number of communities q as an input. Different levels of information correspond to different numbers of communities q. For instance, if only one target needs to be identified, q = 2 is enough. The q communities then comprise the target and the background.

In the following sections, we will use the "unsupervised" image segmentation and let the algorithm decide the community number q.

(2) Next, we sequentially "pick up" each node and place it in the community that best lowers the energy of Eqs. (1, 2) based on the current state of the system.

(3) We repeat this process for all nodes and continue iterating until no energy lowering moves are found after one full cycle through all nodes.

(4) We repeat these processes "t" times (trials) and select the lowest energy result as the best solution.
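The four steps above can be sketched for the unweighted Hamiltonian of Eq. (1) as follows. This is a simplified illustration under stated assumptions rather than the implementation used for the results reported here: the function names, the random node ordering used to distinguish trials, the option of moving a node into a new empty community, and the six-node test graph are all choices made for this example.

```python
import random

def pair_energy(adjacency, gamma, i, j):
    """Contribution of the pair (i, j) to Eq. (1) when both nodes share a community."""
    a_ij = adjacency[i][j]
    return -(a_ij - gamma * (1 - a_ij))

def greedy_potts(adjacency, gamma, trials=4, seed=0):
    """Return (energy, labels) of the lowest-energy result among `trials` greedy runs."""
    rng = random.Random(seed)
    n = len(adjacency)
    best = None
    for _ in range(trials):                          # step (4): repeat t times
        sigma = list(range(n))                       # step (1): one node per community
        order = list(range(n))
        rng.shuffle(order)                           # trials differ by the node ordering
        moved = True
        while moved:                                 # step (3): sweep until nothing moves
            moved = False
            for node in order:                       # step (2): best community for this node
                costs = {max(sigma) + 1: 0.0}        # option: start a new, empty community
                for c in set(sigma):
                    costs[c] = sum(pair_energy(adjacency, gamma, node, j)
                                   for j in range(n) if j != node and sigma[j] == c)
                target = min(costs, key=costs.get)
                if costs[target] < costs[sigma[node]] - 1e-12:
                    sigma[node] = target
                    moved = True
        energy = sum(pair_energy(adjacency, gamma, i, j)
                     for i in range(n) for j in range(i + 1, n) if sigma[i] == sigma[j])
        if best is None or energy < best[0]:
            best = (energy, sigma)
    return best

# Hypothetical test graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

energy, labels = greedy_potts(A, gamma=1.0)
print(energy, labels)
```

Each accepted move strictly lowers the energy, so every trial terminates. For this small graph at $\gamma = 1$, the only partition that is stable against single-node moves is the pair of triangles.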



VII. MULTI-SCALE NETWORKS 

After determining the adjacency matrix in Sec. IV A and Sec. IV B, we now turn to the so-called "resolution parameter" ($\gamma$) in Eq. (1)/Eq. (2). In [38], we introduced the multiresolution algorithm to select the best resolution. Our multi-scale community detection was inspired by the use of overlaps between replicas in spin-glass physics. In the current context, we employ information-theory measures to examine contending partitions for each system scale. As $\gamma$ decreases, the minima of Eqs. (1, 2) lead to solutions with progressively lower intra-community edge densities, effectively "zooming out" toward larger structures. We determine all natural graph scales by identifying the values of $\gamma$ for which the earlier mentioned "replicas" exhibit extrema in the average of information theory overlaps such as the normalized mutual information ($I_N$) and the variation of information ($V$)
when expressed as functions of $\gamma$, $T$. The extrema and plateaus of the average information theory overlaps as a function of $\gamma$, $T$ over all replica pairs indicate the natural network scales [38]. The replicas can be chosen to be identical copies of the same system for the detection of static structures, e.g., in image segmentation.

We will briefly introduce the information theory mea- 
sures in the following section. 

A. Information theory measures 

The normalized mutual information $I_N$ and the variation of information $V$ are the accuracy parameters which are employed to calculate the similarity (or overlap) between replicas.

We begin with a list of definitions of the information 
theory overlaps as they pertain to community detection. 
• Shannon Entropy: If there are $q$ communities in a partition $A$, then the Shannon entropy is

$$H_A = -\sum_{a=1}^{q} \frac{n_a}{N} \log_2 \frac{n_a}{N}, \tag{12}$$

where $n_a/N$ is the probability for a randomly selected node to be in community $a$, $n_a$ is the number of nodes in community $a$, and $N$ is the total number of nodes.

• Mutual Information:

The mutual information $I(A,B)$ between partitions found by two replicas $A$ and $B$ is

$$I(A,B) = \sum_{a=1}^{q_A} \sum_{b=1}^{q_B} \frac{n_{ab}}{N} \log_2 \frac{n_{ab} N}{n_a n_b}, \tag{13}$$

where $n_{ab}$ is the number of nodes of community $a$ of partition $A$ that are shared with community $b$ of partition $B$, $q_A$ (or $q_B$) is the number of communities in partition $A$ (or $B$), and $n_a$ (or $n_b$) is defined the same as before, i.e., the number of nodes in community $a$ (or community $b$).

• Variation of information:

The variation of information ($0 \le V(A,B) \le \log_2 N$) between two partitions $A$ and $B$ is given by

$$V(A,B) = H_A + H_B - 2 I(A,B). \tag{14}$$

• Normalized Mutual Information:

The normalized mutual information ($0 \le I_N(A,B) \le 1$) is

$$I_N(A,B) = \frac{2 I(A,B)}{H_A + H_B}. \tag{15}$$
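The measures of Eqs. (12)-(15) translate directly into a few lines of Python. In the sketch below the function names are hypothetical, partitions are given as lists of community labels, and the value returned when both partitions consist of a single community (where the ratio in Eq. (15) is undefined) is a convention assumed here.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """H_A of Eq. (12), in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """I(A, B) of Eq. (13) for two label lists over the same nodes."""
    n = len(a)
    n_a, n_b, n_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log2(c * n / (n_a[x] * n_b[y]))
               for (x, y), c in n_ab.items())

def variation_of_information(a, b):
    """V(A, B) of Eq. (14)."""
    return shannon_entropy(a) + shannon_entropy(b) - 2.0 * mutual_information(a, b)

def normalized_mutual_information(a, b):
    """I_N(A, B) of Eq. (15); set to 1 when H_A + H_B = 0 (an assumed convention)."""
    h = shannon_entropy(a) + shannon_entropy(b)
    return 1.0 if h == 0 else 2.0 * mutual_information(a, b) / h

same = ([0, 0, 1, 1], [1, 1, 0, 0])        # identical partitions with swapped labels
crossed = ([0, 0, 1, 1], [0, 1, 0, 1])     # statistically independent partitions
for a, b in (same, crossed):
    print(normalized_mutual_information(a, b), variation_of_information(a, b))
```

Both measures depend only on how nodes are grouped and not on the label values, so two replicas that find the same partition with permuted labels give $I_N = 1$ and $V = 0$.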



Now, here is a key idea employed in [38] which will be of great use in our image segmentation analysis: when taken over an entire ensemble of replicas, the average $I_N$ or $V$ indicates how strongly a given structure dominates the energy landscape. High values of $I_N$ (or low values of $V$) correspond to more dominant and thus more significant structure. From a local point of view, at resolutions
where the system has well-defined structure, a set of in- 
dependent replicas should be highly correlated because 
the individual nodes have strongly preferred community 
memberships. Conversely, for resolutions "in-between" 
two strongly defined configurations, one might expect 
that independent replicas will be less correlated due to 
"mixing" between competing divisions of the graph. 



B. The application of the multiresolution
algorithm for a hierarchical network example

We now briefly illustrate how the multiresolution algorithm [38] works in practice by presenting an example in which it is applied to a hierarchical test system of N = 1024 nodes.

To begin the multiresolution algorithm, we need to specify the number of replicas r at each test resolution, the number of trials per replica t, and the starting and ending resolutions [γ_0, γ_f]. Usually, the number of replicas is 8 ≤ r ≤ 12 and the number of trials is 2 ≤ t ≤ 20. As detailed in Section VI, we select the lowest energy solution among the t trials for each replica. The initial states within each of the replicas and trials are generated by reordering the node labels in the "symmetric" initialized state of one node per community. These permutations P simply reorder the node numbers (1, 2, 3, ..., N) → (P1, P2, ..., PN) (with Pi the image of i under a permutation) and thus lead to a different initial state.

(1) The algorithm starts from the initialization of the system described in item (1) of Section VI.

(2) We then minimize Eq. (1) independently for all replicas at a resolution γ = γ_i ∈ [γ_0, γ_f], as described in Section VI. Initially i = 0 (i.e., γ = γ_0).

(3) The algorithm then calculates the average inter-replica information measures, such as I_N and V, at that value of γ.

(4) The algorithm then proceeds to the next resolution point γ_{i+1} ∈ (γ_0, γ_f] (with γ_{i+1} > γ_i).

(5) We then return to step number (2).

(6) After examining the case of γ = γ_f, the algorithm outputs the inter-replica information theory overlaps for the entire range of the resolutions studied (i.e., γ on the interval [γ_0, γ_f]).

(7) We examine those values of γ corresponding to extrema in the average inter-replica information theory overlaps. Physically, for these values the resulting image segmentation is locally insensitive to the change of scale (i.e., the change in γ) and generally highlights prominent features of the image.
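The loop structure of steps (2)-(6) can be sketched as follows. This is only an outline under stated assumptions: `solve` stands for any community detection routine with the hypothetical signature `solve(gamma, seed) -> (energy, labels)`, `overlap` is any symmetric pair measure such as the normalized mutual information, and both names as well as the seeding scheme are illustrative rather than taken from the original work.

```python
from itertools import combinations


def average_overlap(partitions, overlap):
    """Average of overlap(A, B) over all distinct replica pairs.

    For a symmetric overlap, the mean over unordered pairs equals the
    normalized sum over ordered pairs A != B.
    """
    pairs = list(combinations(partitions, 2))
    return sum(overlap(a, b) for a, b in pairs) / len(pairs)


def multiresolution_sweep(solve, overlap, gammas, n_replicas=8, n_trials=4):
    """Return a list of (gamma, average inter-replica overlap).

    solve(gamma, seed) is a hypothetical solver interface that returns
    (energy, labels) for one trial started from its own initial state.
    """
    results = []
    for gamma in gammas:                      # steps (4)-(5): scan resolutions
        replicas = []
        for rep in range(n_replicas):         # step (2): independent replicas
            trials = [solve(gamma, seed=rep * n_trials + t)
                      for t in range(n_trials)]
            # keep the lowest energy solution among the trials of this replica
            _, best_labels = min(trials, key=lambda trial: trial[0])
            replicas.append(best_labels)
        # step (3): average inter-replica measure at this resolution
        results.append((gamma, average_overlap(replicas, overlap)))
    return results                            # step (6): overlaps vs. gamma
```

Step (7) then amounts to locating extrema or plateaus in the returned curve.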

With A and B denoting graph partitions in two different replicas and Q(A, B) their overlap, these average inter-replica overlaps for a general quantity Q [38] are explicitly

    \langle Q \rangle = \frac{1}{r(r-1)} \sum_{A \neq B} Q(A, B).    (16)

Similarly, for a single replica quantity (such as the Shannon entropy H for partitions A in different replicas), the average is, trivially, ⟨Q⟩ = \sum_A H(A)/r. (Averages for higher order inter-replica cumulants may be similarly written down with a replica symmetric probability distribution function [38].)

Fig. 4 shows the result of the multiresolution algorithm applied to the three-level hierarchy system in Fig. 3. The system investigated is that of a standard simple graph with unweighted links (i.e., in Eq. (1), A_ij = 1 if nodes i and j share an edge and is zero otherwise). In Fig. 3, "level 3" communities exhibit a density of links p_3 = 0.9 (i.e., a fraction p_3 of the intra-community node pairs are connected by a link (A_ij = 1)). The individual communities in level 3 have sizes that range from 5 to 24 nodes. The less dense communities in level 2 harbor a density of links p_2 = 0.3; the nodes, in this case, are divided into five groups with sizes that vary from 26 to 95. Highest up in the hierarchy is the trivial level 1 "partition", that of a completely merged system of 1024 nodes. Thus, as a function of γ, this easily solvable system exhibits "transitions" between different stable solutions corresponding to different regions (or basins) of γ. In Sections X and XI G, we will further discuss additional transitions between easily solvable regions and regions of parameter space which are very "hard" or impossible ("unsolvable").

Fig. 4(a) depicts the averages of I_N (on the left axis) and I (right axis) over all replica pairs. (We further provide in this figure the number of communities q.) A "cascade" composed of three plateaus is evident in these information theory measures. Similarly, Fig. 4(b) shows V (left axis) and H (right axis), averaged over all replica pairs. The extrema denoted by the arrows in both panels (a) and (b) are the correctly identified levels 2 and 3, respectively, of the hierarchy depicted in Fig. 3. The two plateaus with the peak values in panel (a) correspond to a normalized mutual information of I_N = 1 (the highest theoretically possible) and, similarly, the corresponding minima in panel (b) have a variation of information V = 0 (the smallest value possible) for the same range of γ values. These extreme values of I_N and V indicate perfect correlations among the replicas for both levels of the hierarchy. The "plateaus" in H, I and q are also important indicators of system structure. These plateaus (and more general extrema elsewhere) illustrate when the system is insensitive to parameter changes and approaches stable solutions. In Section X (and in Eq. (21) in particular), we will discuss this more generally in the context of the phase diagram of the community detection problem.

FIG. 3: Heterogeneous hierarchical system corresponding to the plots in Fig. 4. In this figure, the 1024 node system is divided into a three-level hierarchy. Level 3 has 59 communities with sizes from 10 to 24 nodes. Level 2 has 16 communities with sizes from 26 to 95 nodes. Level 1 is the completely merged system of 1024 nodes. The average edge density is p = 0.054. This system has 28185 edges.

(a) Plot of information measures I_N, I and the community number q vs the Potts model weight γ in Eq. (1) for the three-level heterogeneous hierarchy depicted in Fig. 3.

(b) Plot of information measures V, H and the community number q vs the Potts model weight γ in Eq. (1) for the three-level heterogeneous hierarchy depicted in Fig. 3.

FIG. 4: Plot of information measures I_N, V, H and I vs the Potts model weight γ for the hierarchy of Fig. 3. In panel (a), the peak (plateau) values of I_N denoted by the arrows correspond to levels 2 and 3 of the hierarchy depicted in Fig. 3. Similarly, in panel (b), the minimal V values, indicated by arrows, accurately correspond to levels 2 and 3 of the hierarchy. The number of communities q is 16 and 59 in the disparate plateau regions (denoted by the arrows) in both panels. These community assignments (and, obviously, also their numbers) are exactly the same as those of the communities in levels 2 and 3 of the original hierarchical graph of Fig. 3. In panels (a) and (b), both the mutual information I and the Shannon entropy H display a plateau behavior corresponding to the correct solutions.



VIII. REPLICA CORRELATIONS AS WEIGHTS 
IN A GRAPH 

Within the multiresolution method, significant structures are identified by strongly correlated replicas (multiple copies of the studied system). Thus, if a node pair is in the same community in all replicas, the two nodes must have a strong preference to be connected, i.e., a large edge weight. Similarly, if a node pair is not always in the same community in all replicas, the two nodes must have a preference not to be connected, i.e., a small edge weight. We therefore reassign edge weights based on the correlations between replicas.

Specifically, we first generate r replicas by permuting the "symmetric" initialized state of one node per cluster of the studied system, then apply our community detection algorithm to each replica and record the community membership of each node. We then calculate the probability p_ij of each node pair based on the statistics of the replicas. The probability is defined as follows:

    p_{ij} = \frac{1}{r} \sum_{\alpha=1}^{r} \omega_{ij}^{\alpha},    (17)

where

    \omega_{ij}^{\alpha} = \delta_{\sigma_i^{\alpha} \sigma_j^{\alpha}} + (1 - \delta_{\sigma_i^{\alpha} \sigma_j^{\alpha}}) \frac{\exp(-|r_i^{\alpha} - r_j^{\alpha}| / \ell)}{N_A^{\alpha} N_B^{\alpha}}.    (18)

In Eq. (18), when nodes i and j belong to the same community in replica α, i.e., \delta_{\sigma_i^{\alpha} \sigma_j^{\alpha}} = 1, we have \omega_{ij}^{\alpha} = 1. When nodes i and j are not in the same community in replica α, i.e., \delta_{\sigma_i^{\alpha} \sigma_j^{\alpha}} = 0 (we use A and B to represent these two different communities, where i ∈ A and j ∈ B, and N_A^α and N_B^α denote the sizes of clusters A and B in replica α), \omega_{ij}^{\alpha} = \exp(-|r_i^{\alpha} - r_j^{\alpha}| / \ell) / (N_A^{\alpha} N_B^{\alpha}). As throughout, |r_i^α − r_j^α| is the distance between nodes i and j in replica α. In Eq. (17), we sum the probabilities over the replicas to define the edge weight. The assigned weights given by Eq. (17) are based on a frequency type inference. Although we will not report on it in this work, it is possible to perform Bayesian analysis with weights ("priors") that are derived from a variant of Eq. (18): this enables an inference of the correlations from the sequence of results concerning the correlations between nodes i and j in a sequence of different replicas.
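A minimal NumPy sketch of the replica-based pair weights described above is given here. It rests on several assumptions that are not fixed by the text: the weights are averaged over the r replicas, node positions are identical in all replicas (as for pixels of one image), community labels are non-negative integers, and the function name and argument layout are purely illustrative.

```python
import numpy as np


def replica_pair_probabilities(labels, positions, ell=np.inf):
    """Pair weights p_ij built from community labels in r replicas.

    labels    : (r, N) array of non-negative integer community labels.
    positions : (N, d) array of node coordinates (assumed replica independent).
    ell       : spatial scale; ell = inf switches the distance decay off.
    """
    labels = np.asarray(labels)
    positions = np.asarray(positions, dtype=float)
    r, n = labels.shape
    # Pairwise Euclidean distances |r_i - r_j|
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    decay = np.exp(-dist / ell)
    p = np.zeros((n, n))
    for alpha in range(r):
        lab = labels[alpha]
        same = lab[:, None] == lab[None, :]
        # Size of the community that contains each node in this replica
        size = np.bincount(lab)[lab]
        # omega = 1 for pairs in the same community, otherwise a decaying
        # weight divided by the product of the two community sizes
        omega = np.where(same, 1.0, decay / (size[:, None] * size[None, :]))
        p += omega
    return p / r  # assumption: average over replicas
```

Pairs that share a community in every replica obtain p_ij = 1, while pairs that are separated in most replicas obtain small weights.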

In unweighted graphs, we connect nodes if the edge weight between the node pair is larger than some threshold value p̄ in Eq. (1), i.e.,

    A_{ij} = \Theta(p_{ij} - \bar{p}).    (19)

In weighted graphs, the analog of Eq. (2) is the Hamiltonian given by

    H = \frac{1}{2} \sum_{i \neq j} (\bar{p} - p_{ij}) [\Theta(p_{ij} - \bar{p}) + \gamma \Theta(\bar{p} - p_{ij})] \delta(\sigma_i, \sigma_j).    (20)

That is, when there is a high probability p_ij, relative to a background threshold p̄, that nodes i and j are linked, we assign a positive edge weight to the link (ij) of size (p_ij − p̄). Similarly, if the probability of a link (ij) is low, we assign a negative weight of size γ(p_ij − p̄).

Armed with Eq. (20), we then minimize it in an identical fashion to the minimization of Eq. (2) that we discussed earlier. Specifically, we follow the 4 steps outlined in Section VI for non multi-scale images and the 7 steps of Section VII B in the analysis of general multi-scale systems.
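The thresholding rule for unweighted graphs and the weighted energy built from the pair probabilities can be sketched as below. This is an illustration under assumptions: the energy is written as a sum over distinct node pairs in the same community, with attractive contributions for p_ij above the threshold and repulsive contributions scaled by γ below it, as described in the text; the function names `threshold_adjacency` and `weighted_energy` are hypothetical.

```python
import numpy as np


def threshold_adjacency(p, p_bar):
    """Unweighted graph: link i and j when p_ij exceeds the threshold p_bar."""
    a = (np.asarray(p, dtype=float) > p_bar).astype(int)
    np.fill_diagonal(a, 0)  # no self-links
    return a


def weighted_energy(p, labels, p_bar, gamma):
    """Energy of a partition for replica-derived weights.

    Pairs in the same community contribute (p_bar - p_ij), multiplied by 1
    when p_ij > p_bar (energy is lowered) and by gamma otherwise (raised).
    """
    p = np.asarray(p, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)            # exclude i == j
    factor = np.where(p > p_bar, 1.0, gamma)
    # The factor 1/2 compensates for counting each pair (i, j) twice
    return 0.5 * float(np.sum((p_bar - p) * factor * same))
```

Lower energies correspond to partitions that group strongly correlated pairs and separate weakly correlated ones.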



IX. SUMMARY OF PARAMETERS 

We now very briefly collect and list anew the param- 
eters that define our Hamiltonians and appear in our 
methods. 

• The resolution parameter γ in Eqs. (1, 2, 20). This parameter sets the graph scale over which we search for communities. This parameter is held fixed (typically with a value of γ = O(1)) in the community detection method and varies within our multi-scale analysis. We determine the optimal value of γ by determining the local extrema of the average information theory overlaps between replicas.

• The spatial scale ℓ in Eq. (7). Similar to the more general graph scale set by γ, we may determine the optimal ℓ by examining extrema in the average inter-replica information theory correlations. In practice, in all but the hardest cases (i.e., the case of the dalmatian dog in Fig. (1)), we ignored this scale and fixed ℓ to be infinite. Fixing ℓ = 1 led to good results in the analysis of the dalmatian dog.

• The spatial cutoff scale Λ for defining link weights; see the brief discussions after Eqs. (6, 10). Whenever the spatial distance between two sites or blocks exceeded a threshold distance Λ, we set the link weight to be zero. We did not tune this parameter in any of the calculations. It was fixed to the value of Λ = 30.

• The scale of the block size L_{x,y} introduced in Section IV B 1. This parameter is far smaller than the image size N_x × N_y, yet large enough to cover the image features. We usually set L_x × L_y to be 9 × 9 for an image size N_x × N_y of around 200 × 200.

• The background intensities V̄ in Eq. (2) and p̄ in Eq. (20). Similar to the graph scales set by γ and ℓ, we may determine the optimal V̄ and p̄ by observing the local extrema of the average information theory correlations between replicas.

As we will elaborate on briefly, all optimal parameters can be found by determining the local extrema of the information theory correlations that signify no change in structure over variations of scale. In practice, we may fix some parameters and vary others: usually, Λ is fixed at 30, L_x × L_y is in the range of 7 × 7 to 11 × 11, and γ, V̄ and p̄ are varied.

As an aside, we briefly note for readers inclined towards spin glass physics and optimization theory that, in principle, in the large N limit (images with a large number of pixels) the effective optimal values of the parameters listed above may be derived by solving the so-called "cavity" equations [48] that capture the maximal inference possible (in their application without the aid of the replicas that we introduced here) [51], [52]. In the current context, in applying these equations anew to image segmentation, we arrive at the maximal inference possible of objects in an image. While these equations are tractable for simple cases, solving them is relatively forbidding for general cases. In practice, we thus directly and efficiently examine our Potts model Hamiltonians of Eqs. (1, 2, 20) and, when needed, directly infer optimal values of the parameters by examining inter-replica correlations as described in the earlier sections. This will be expanded on in the next section (specifically, in Eq. (21)). [Detailed applications of this method are provided in Section XI G.]



X. COMPUTATIONAL COMPLEXITY, THE 
PHASE DIAGRAM, AND THE 
DETERMINATION OF OPTIMAL PARAMETERS 

Our community detection algorithm is very rapid. For a system with L links, the typical convergence time scales as O(L^{1.3}) [44]. In an image with N pixels, with all of the constants Λ, L_{x,y} = O(1) (i.e., not scaling with the system size), the number of links is L ~ N.

Our general multi-scale community detection algorithm (that with varying γ) has a convergence time τ ~ L^{1.3} ln N [38]. Thus, generally, for an image of size N, the convergence time is τ ~ N^{1.3} ln N. Rapid convergence occurs in all but the "hard phase" of the community detection problem.

Specifically, we numerically investigated the phase diagram as a function of noise and temperature (i.e., when different configurations are weighted with a Boltzmann factor exp(−βH) with β = 1/T at a temperature T) for general graphs with an arbitrary number of clusters in [53]. Related analytic calculations were done for sparse graphs in [52]. In particular, in these and earlier works [381 Hi| it was found that there is a phase transition between detectable and undetectable clusters. The detectable phase further splinters into an "easy" and a "hard" phase. These three phases in the community detection problem constitute analogs of the three related "SAT-hard-unSAT" phases in the k-SAT problem ga, EH. The phase diagram found in [53] exhibits universal features. Increasing the temperature can aid the detection of clusters [53]. The universal features of the phase diagram and the known cascade of transitions that appear on introducing temperature enable better confidence in the results of the community detection algorithm. One of the central results of Ref. [53] is that the "easy" solvable phase(s) of the community detection problem, which lead to correct relevant solutions (i.e., not noisy partitions of a structureless system), universally appear in "flat" [38], [53] phase(s) [see also the flat information theoretic curves in Fig. (4) and the related discussion in Section VII B] as ascertained by the inter-replica averages of all thermodynamic and information theoretic quantities {⟨Q⟩}. These may correspond to the internal energy (Q = H), the average Shannon entropy (Q = H), the average inter-replica normalized mutual information and variation of information (Q = I_N, V), the complexity (Q = Σ) [48], or an associated "susceptibility" (Q = χ) [SH that monitors the onset of large complexity. [This susceptibility will be defined with the aid of the change in the average normalized mutual information I_N as a function of the number of trials t. It is defined as χ(n) = [I_N(t = n) − I_N(t = 4)].] That is, with z denoting a set of generalized parameters (e.g., artificially added additional noise in networks (z = p_out) [13], temperature (z = T) [53], or the resolution parameter (z = γ) [38]), pertinent partitions appear for those values of the parameters z for which

    \frac{\partial \langle Q \rangle}{\partial z} = 0.    (21)

As alluded to above, a particular realization of Eq. (21) appears in the hierarchical system discussed in Section VII B wherein z = γ and Q = I_N, V. In that case, Eq. (21) was satisfied in well defined plateaus.
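On a discrete grid of parameter values, the plateau criterion can be checked with finite differences. The snippet below is a minimal sketch, assuming that the averages ⟨Q⟩ have already been computed on an increasing grid of z values; the function name `flat_regions` and the tolerance `tol` are illustrative choices and not part of the original method. For a resolution scan on a logarithmic scale, one may pass log γ as `z`.

```python
import numpy as np


def flat_regions(z, q_avg, tol=1e-3):
    """Return intervals (z_start, z_end) on which |d<Q>/dz| <= tol.

    z     : increasing sequence of parameter values.
    q_avg : inter-replica averages <Q> measured at those values.
    """
    z = np.asarray(z, dtype=float)
    q = np.asarray(q_avg, dtype=float)
    slope = np.abs(np.diff(q) / np.diff(z))   # finite-difference derivative
    flat = slope <= tol
    regions, start = [], None
    for k, is_flat in enumerate(flat):
        if is_flat and start is None:
            start = k                         # a plateau begins at z[k]
        elif not is_flat and start is not None:
            regions.append((float(z[start]), float(z[k])))
            start = None
    if start is not None:                     # plateau extends to the last point
        regions.append((float(z[start]), float(z[-1])))
    return regions
```

Each returned interval is a candidate range of parameters over which the partition is insensitive to the change of scale.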

When present, crisp solutions are furthermore generally characterized by relatively high values of I_N, and these correspond to the "easy phase" of the image segmentation problem. In Sec. XI G, we will provide an explicit analysis of the phase diagram and optimal parameters as they pertain to several example images.

All of the results presented below in the current manuscript (except those in Sec. XI G) were attained at zero temperature and may be improved by the incorporation of thermal annealing, as the results of [53] illustrate for general systems.



XI. RESULTS 
A. Brain Image 

1. Unweighted graphs 

We start the review of the results of our methods by analyzing an unweighted graph (Eq. (1)) for the grey-scale brain image shown in Fig. 5. We assign edges between pixels only if the intensity difference is less than the threshold V̄ = 16, as denoted in panel (a) of Fig. 5. The algorithm uses Eq. (1) to solve for a range of resolution parameters γ in the interval [γ_0, γ_f]. In the particular case of Fig. 5, γ_0 = 10^{-3} and γ_f = 100. There are two more input parameters that are needed in our algorithm: the number of independent replicas r that will be solved at each tested resolution and the number of trials per replica t. We use r = 10 and t = 4, respectively, in Fig. 5.

(a) The result of the "multiresolution" algorithm applied to the unweighted brain image shown in (b): I_N, V and q in terms of the resolution γ.

(b) The unweighted results for the brain image at different γ, corresponding to γ_1 = 0.1, γ_2 = 0.8, γ_3 = 79.4.

FIG. 5: [Color Online.] The plot of the normalized mutual information I_N, variation of information V and the number of communities q as a function of γ for the brain image. This image is reproduced with permission from the Iowa Neuroradiology Library. The axis for γ is on a logarithmic scale. There are three prominent peaks in the V curve. We apply our community detection algorithm to the grey-scale brain image at these three values of γ. The corresponding results are shown in panel (b). Note that the results show a three-level hierarchy as γ varies.

(a) The result of the "multiresolution" algorithm for the weighted brain image shown in (b): I_N, V and q in terms of the resolution γ.

(b) The weighted results for the brain image at different γ, corresponding to γ_1 = 0.1, γ_2 = 0.64, γ_3 = 64.

FIG. 6: [Color Online.] The result of the "multiresolution" algorithm for the weighted brain image shown in panel (b). In panel (a), the "multiresolution" result behaves differently from that of Fig. 5 but keeps the same trend. The structure is only stable in the resolution range of γ < 0.01, compared to the wider range of γ < 0.1 in Fig. 5. This illustrates that the weighted graph is more sensitive to the change of resolution.
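The edge assignment rule for the unweighted graph (link two pixels only if their intensity difference is below the threshold) can be sketched as follows. This is an illustrative construction under assumptions: pixels are linked only within a spatial `cutoff` measured in pixel units, nodes are numbered in row-major order, and the function name and default values are hypothetical rather than those of the original implementation.

```python
import numpy as np


def unweighted_pixel_graph(image, v_bar=16.0, cutoff=1.0):
    """List edges (i, j) between pixels whose intensities differ by < v_bar.

    image  : 2D array of grey-scale intensities.
    v_bar  : intensity threshold.
    cutoff : maximal Euclidean pixel distance considered for a link.
    Pixels are numbered row by row: index = row * n_cols + col.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    reach = int(np.floor(cutoff))
    edges = []
    for r in range(rows):
        for c in range(cols):
            for dr in range(0, reach + 1):
                for dc in range(-reach, reach + 1):
                    # visit each unordered pixel pair exactly once
                    if dr == 0 and dc <= 0:
                        continue
                    r2, c2 = r + dr, c + dc
                    if not (0 <= r2 < rows and 0 <= c2 < cols):
                        continue
                    if dr * dr + dc * dc > cutoff * cutoff:
                        continue
                    if abs(image[r, c] - image[r2, c2]) < v_bar:
                        edges.append((r * cols + c, r2 * cols + c2))
    return edges
```

The resulting edge list defines the adjacency matrix of the unweighted graph on which the community detection is run.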



As noted earlier (see Section VI), for each replica, we select the lowest energy solution among the t trials. The r replicas are generated by reordering the "symmetric" initialized state of one node per community. We then use the information based measures (i.e., I_N or V) to determine the multiresolution structure.

The plots of I_N, V and q as a function of γ in Fig. 5 exhibit non-trivial behaviors. Extrema in I_N and V correspond to jumps in the number of communities q. In the low γ region, i.e., γ < 0.1, the number of communities is stable. However, when γ > 0.1, the number of communities q sharply increases. This indicates that the structure changes rapidly as the resolution γ varies. There are three prominent peaks in the V (variation of information) curve. We show the corresponding images at these resolutions in panel (b) of Fig. 5. These segmented images show more and more sophisticated structures. The lower right image at a resolution of γ = 79.4 shows the information in detail. Different colors in the image correspond to different clusters. There are at least five contours surrounding the tumor that denote the degree by which the tissue was pushed by the tumor. The lower left image at the resolution γ = 0.8 is less detailed than the one on the right. Nevertheless, it retains the details surrounding the tumor. If we further decrease γ, the upper right image at the resolution γ = 0.1 does not keep the details of the tumor boundary, only the rough location of the tumor. Thus, neither too large nor too small resolutions are appropriate for tumor detection in this image. The resolution around γ = 0.8 is the most suitable in this case. This is in accord with the general maxim found in Section IX concerning a value of γ = 1. We reiterate that, in general, the optimal value of γ is found by Eq. (21) (an example of which is manifest in the information theory plateaus discussed in Section VII B). In Section XI G, we will discuss, in depth, how the optimal values of γ may be determined in (weighted) example systems.



2. Weighted graphs 

In Figs. (6, 7), we provide the "multiresolution" results for the weighted graph (Eq. (2)) as a function of γ and V̄ for the brain image. Both the resolution γ and the threshold V̄ control the hierarchical structures: the peaks in the normalized mutual information I_N and variation of information V always correspond to the jumps in the number of communities q. The jumps in q correlate with the changes in hierarchical structures on different scales. We can combine both parameters to obtain the desirable results in the test images. See, e.g., the 3D plot of I_N(V̄, γ).

The results of our method with weighted edges are more sensitive to the changes of parameters (as seen from a comparison of Fig. 6 with Fig. 5). According to Eq. (2), edges (ij) with small (or large) difference D_ij will decrease (or increase) the energy by |V̄ − D_ij| (or γ|V̄ − D_ij|). However, if the unweighted graphs and the Potts model with discrete weights (Eq. (1)) are applied, the edges with small or large "color" difference will decrease or increase the energy by the amount of 1 or γ. Thus, considerable information (e.g., the "color" of each pixel) is omitted when using an unweighted graph approach.
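The difference between the two weighting schemes can be made concrete with a small sketch of the energy contribution of a single pair of nodes placed in the same community. The functions below only restate the rule described in the text; their names are illustrative, and the handling of the boundary case D_ij = V̄ is an assumption of this sketch.

```python
def weighted_pair_energy(d_ij, v_bar, gamma):
    """Weighted model: the contribution scales with the distance from v_bar."""
    if d_ij < v_bar:
        return -(v_bar - d_ij)        # similar pixels lower the energy
    return gamma * (d_ij - v_bar)     # dissimilar pixels raise the energy


def unweighted_pair_energy(d_ij, v_bar, gamma):
    """Unweighted model: only the sign of (d_ij - v_bar) matters."""
    return -1.0 if d_ij < v_bar else gamma
```

For example, with V̄ = 20 and γ = 0.5, differences of 5 and 19 give weighted contributions of −15 and −1, whereas the unweighted model assigns −1 to both.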




(a) The result of the "multiresolution" algorithm for the weighted brain image shown in (b): I_N, V and q in terms of the threshold V̄.

(b) The weighted results for the images at different V̄, corresponding to V̄_1 = 14, V̄_2 = 21, V̄_3 = 34.

FIG. 7: [Color Online.] The "multiresolution" result also shows the hierarchical structure as the threshold V̄ varies, as in Fig. 6. A higher "V̄" corresponds to a lower "γ", which means that pixels tend to merge at higher "V̄". The structure is stable in the range of V̄ > 25, below which the structure is sensitive.



B. A painting by Salvador Dali 

We next apply our multiresolution community detection algorithm to images that are, by construction, truly multi-scale. The results at different resolutions are shown in Fig. 8. The original image is that of Salvador Dali's famous painting "Gala contemplating the Mediterranean sea which at twenty meters becomes a portrait of Abraham Lincoln". Our algorithm perfectly detected the portrait of Lincoln at low resolution, as shown in the segmentation result appearing in panel (I) of Fig. 8(b). As the resolution parameter γ increases, the algorithm is able to detect more details. However, due to the non-uniform color and the similarity of the surrounding colors to those of the targets, the results are very noisy. At the threshold of V̄ = 20, the algorithm has difficulty in merging pixels to reproduce the lady in the image. For example, in image (II) in panel (c), the lady's legs are merged into the background. In image (III), only one leg is detected. In images (IV) and (V), both legs can be detected but belong to different clusters.

(a) The variation of information V as a function of the resolution γ for the image shown in panel (b).

(b) The original image and the corresponding segmentation for the specific resolution marked (I) in panel (a).

(c) The corresponding images at the specific resolutions marked (II), (III), (IV) and (V) in panel (a).

FIG. 8: [Color Online.] The specific image is from [54]. At close distance, this is "Gala contemplating the Mediterranean sea" while at larger distance it is "a portrait of Abraham Lincoln". Panel (a) shows the variation of information as a function of resolution. We pick the resolution at each "peak" position and apply our algorithm at these particular resolutions. Panels (b) and (c) show the resulting images at the corresponding resolutions marked in panel (a). Note that at low resolution, the resulting segmentation clearly depicts "the portrait of Abraham Lincoln" as shown in panel (b) on the right. In particular, notwithstanding noise, as γ increases, the segmentation results show more details and we can detect the lady in the middle in (II)-(V) of panel (c).



C. Benchmarks 

In order to assess the success of our method and ascer- 
tain general features, we applied it to standard bench- 
marks. In particular, we examined two known bench- 
marks: (i) The Berkeley image segmentation benchmark 
and (ii) that of Microsoft Research. 



1. Berkeley Image Segmentation Benchmark 

We were able to accurately detect the targets in test images, as in Figs. (9, 10). The original images in Fig. 9 were downloaded from the Berkeley image segmentation benchmark BSDS300 [55], and those of Fig. 10 were downloaded from Microsoft Research [56]. We will now compare our results with the results of other algorithms. The Berkeley image segmentation benchmark provides a platform to compare boundary detection algorithms by an "F-measure". This quantity is defined as

    F-measure = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.    (22)

"Recall" is computed as the fraction of correct instances among all instances that actually belong to the relevant subset, while "Precision" is the fraction of correct instances among those that the algorithm believes to belong to the relevant subset. Thus, we have to draw the boundaries in our results and compute the F-measure.

    Image   F-Absolute Potts Model   F-Global Probability of boundary
    (a)     0.79                     0.78
    (b)     0.94                     0.91
    (c)     0.82                     0.74
    (d)     0.79                     0.83
    (e)     0.75                     0.60

TABLE I: The comparison of the "F-measure" for Fig. 9 by our community detection algorithm ("F-Absolute Potts Model") with the algorithm "Global Probability of boundary" ("gPb") [57, 58], which has the highest score in the Berkeley image segmentation benchmark ("F-Global Probability of boundary"). A higher F-value corresponds to better detection. Note that our algorithm performs better than the "gPb algorithm" on all images except the fourth one. Our fourth image (d) receives a lower score mostly because there are dots in the lower grass region. These small dots give rise to small, high accuracy features that are not expected in the ground truth and thus lower the F-value.
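As a small worked illustration of the F-measure of Eq. (22), the following sketch computes precision, recall and their harmonic mean from counts of correctly and incorrectly detected boundary pixels. The counting of matches between detected and ground-truth boundaries is assumed to be done elsewhere; the function name and the choice to return 0 for empty inputs are conventions of this sketch, not of the benchmark software.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall.

    true_positives  : detected boundary pixels that match the ground truth.
    false_positives : detected boundary pixels without a ground-truth match.
    false_negatives : ground-truth boundary pixels that were missed.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0  # convention of this sketch for empty inputs
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

For instance, 6 correct detections with 2 spurious and 6 missed boundary pixels give a precision of 0.75, a recall of 0.5 and an F-measure of 0.6.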



FIG. 9: [Color Online.] Image segmentation results of our algorithm when tested with examples from the Berkeley BSDS300 benchmark. Shown, in the left column, are the original images. The central column contains the results of our method. The right column provides the boundaries of the images in the middle, obtained by running "EdgeDetect" of Mathematica on the results of our run in the central column. The parameters of the community detection algorithm used for these images are: in (a), γ = 0.001, V̄ = 15. In (b), γ = 0.0001, V̄ = 20. In (c), γ = 0.001, V̄ = 20. In (d), γ = 0.01, V̄ = 15. In (e), γ = 0.01, V̄ = 15. We performed the boundary detection on the results of our community detection algorithm (i.e., the central column) and employed the "F-measure" accuracy parameter in order to compare the results of our algorithm with earlier results reported for the Berkeley image segmentation benchmark (shown in Table I).



We use the tool "EdgeDetect" of the Mathematica software to draw the boundaries within our region detection results, as shown in the right column in Fig. 9. The comparison of the "F-measure" of our algorithm ("F-Absolute Potts Model") and the best algorithm in the benchmark ("F-Global Probability of boundary") [57, 58] is shown in Table I. On the whole, our results are better than those of the algorithm of the Berkeley group: our F-measure is higher for four of the five images.



2. Microsoft Research Benchmarks 

In Fig. 10, we compare our results (in the rightmost column) with the ground truths provided by Microsoft Research (the central column). By adjusting the γ and V̄ values, we can merge the background pixels and highlight the target. In the segmentation of the image of the flower in the first row, γ = 0.001 and V̄ = 20. For both the picnic table in the middle row and the two sheep in the bottom row, we set γ = 0.01 and V̄ = 15.




FIG. 10: [Color Online.] The results of some image segmentations by our Potts model (Eq. (2)) and community detection algorithm ([Hj]). The images were downloaded from the website of Microsoft Research ([56]). The left column contains the original images. The central column contains the ground truths defined by the website of Microsoft Research ([56]), which are the desirable image segmentation results. The right column contains the segmentation results of our algorithm. The parameters used for each image are: (1) γ = 0.001, V̄ = 20 for the flower image. (2) γ = 0.01, V̄ = 15 for the image of the picnic table. (3) Similarly, γ = 0.01, V̄ = 15 for the image of the two sheep. Note that our algorithm works very well for this kind of image, in which the color is nearly uniform within each object.



D. Detection of quasi-periodic structure in 
quasicrystals 

Quasicrystals [59] are ordered but not periodic (hence the name "quasi"). In Fig. 11, the image in row (a) is such a quasicrystal formed by "Penrose tiling". We applied the Fourier transform method to reveal the corresponding underlying structures. In row (a), the image marked by (I) is the original image (downloaded from [60]), the one labeled (II) is the result of our algorithm, and (III) is the image of (II) with connections drawn between nearest neighbor nodes. The images marked by (II) and (III) show the first Penrose tiling (tiling P1). Penrose's first tiling employs a five-pointed pentagram, a "3/5 pentagram" shape and a thin rhombus. Similarly, the resulting images of panels (II) and (III) in row (b) reveal the underlying structure of the superlattice with AB4 stoichiometry and the structural motif of the (3^2.4.3.4) "Archimedean tiling" of the original image (I) (from [61]). The Archimedean tiling displayed in image (III) of row (b) of Fig. 11 employs squares and triangles. It is straightforward to analyze the quasi-periodic structure by applying our image segmentation algorithm, as shown in Fig. 11. By iterating the scheme outlined herein, structure on larger and larger scales was revealed.

FIG. 11: [Color Online.] Quasicrystal images are displayed in panels (I). The corresponding image segmentation results by our algorithm are shown in (II). In (III), we connect the basic objects by lines, resulting in large basic blocks. This process can be repeated recursively, leading to larger and larger scale structures. Note that we are able to reveal the underlying quasi-periodic structures in both row (a) (the original image in (I) is from [60]) and row (b) (the original image in (I) is from [61]). We show the first Penrose tiling (tiling "P1") in (a), and the structural motif of the (3^2.4.3.4) Archimedean tiling in (b).



E. Images with spatially varying intensities 

If the target is similar to the background (as in, e.g.,
animal camouflage), then the simplest initialization of
edges with linear weights will, generally, not suffice. For
example, in Fig. 12, the zebra appears with black and
white stripes. It is hard to detect the zebra directly
because of the large "color" difference between
its black and white stripes. The image in Fig. 14 has a
similarly striped background, which is very difficult
to distinguish from the zebra itself by using the weights




FIG. 12: [Color Online.] The image segmentation results of
the community detection algorithm with Fourier weights as
described in Section IV B 3. Some of the images were downloaded
from Microsoft Research ([56]) and some of them
were downloaded from the Berkeley image segmentation
benchmark ([55]). The left column contains the original images.
The central column (apart from the last two rows) provides
the "ground truths". The right column shows our
results. The parameters used for each image are: (1) $\gamma = 0.01$,
$\bar{V} = -300$ for the tree image. (2) $\gamma = 0.1$ and $\bar{V} = -300$
for the car image. (3) $\gamma = 0.01$ and $\bar{V} = -400$ for the bench
image. (4) $\gamma = 0.01$, $\bar{V} = -100$ for the image of corn. (5)
$\gamma = 0.1$, $\bar{V} = -900$ for the zebra image. Even though the
color is not uniform inside the targets, we can nevertheless
easily detect the targets by this method.



of Eqs.(0[8]) for the edges. Towards this end, we will next
employ the Fourier transform method of Sec. IV B 3.

As seen in Fig. 12, the original images are not uniform.
Rather, these images are composed of different basic
components such as stripes or spots. With the aid of a
Fourier transform within each block, as discussed in
Section IV B 3, we are able to detect the target. For some
of the images, such as the second one in Fig. 12, where
the target is composed of more than one uniform color
or style, our community detection algorithm is able to
detect the boundaries, but the regions inside the
boundary are hard to merge. This is because the block size is
smaller than that needed to cover both the target and
the background. That is, the block size of $L_x \times L_y = 5 \times 5$ is
much smaller than the image size of $N_x \times N_y = 320 \times 213$
in the car image in the second row, so most of the blocks
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FIG. 13: [Color Online.] The results of the image segmenta- 
tion for a "camouflaged image" . The image of the leopard is 
from [62], the lizard is provided in the Berkeley image
segmentation benchmark [55], and the last image is from the
website of the EECS department of Berkeley (H). The
parameters for the shown segmentations are: (1) $\gamma = 1$,
$\bar{V} = -700$ for the image of the leopard, (2) $\gamma = 0.1$, $\bar{V} = -500$
for the image of the lizard, and (3) $\gamma = 1$, $\bar{V} = -1100$ for the
zebra image.



are within one color of the target (car) or the background
(ground). However, the dominant Fourier wave vector of
the region within one color component of the car is
similar to that of the ground. Therefore, the algorithm always
treats them as the same cluster, rather than merging the
region inside the car with the boundary.
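
To illustrate what a "dominant Fourier wave vector" of a block means, the following is a minimal sketch, not the weight definition used in this work. It assumes a plain discrete Fourier transform of a small grey-scale block and simply reports the nonzero mode of largest amplitude. The function name `dominant_wave_vector` and the toy blocks are hypothetical.

```python
import cmath
import math

def dominant_wave_vector(block):
    """Return ((kx, ky), amplitude) of the strongest nonzero DFT mode of a 2D block."""
    ly, lx = len(block), len(block[0])
    best, best_amp = (0, 0), -1.0
    for ky in range(ly):
        for kx in range(lx):
            if kx == 0 and ky == 0:
                continue  # skip the k = 0 mode, which only encodes the mean intensity
            total = 0j
            for y in range(ly):
                for x in range(lx):
                    phase = -2j * math.pi * (kx * x / lx + ky * y / ly)
                    total += block[y][x] * cmath.exp(phase)
            amp = abs(total)
            if amp > best_amp + 1e-9:  # small tolerance so round-off does not reorder ties
                best, best_amp = (kx, ky), amp
    return best, best_amp

# Hypothetical toy blocks: vertical stripes and two uniform patches.
stripes = [[x % 2 for x in range(4)] for _ in range(4)]
uniform_dark = [[1] * 4 for _ in range(4)]
uniform_light = [[9] * 4 for _ in range(4)]

k_stripes, amp_stripes = dominant_wave_vector(stripes)
_, amp_dark = dominant_wave_vector(uniform_dark)
_, amp_light = dominant_wave_vector(uniform_light)
```

In this toy setting, both uniform patches have essentially no nonzero Fourier content, whatever their grey level, which mirrors the observation above: blocks lying inside a single color of the car and blocks of the ground look alike to a purely Fourier-based comparison.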

In other instances (e.g., all the rows other than the
second in Fig. 12), the targets are markedly different
from the backgrounds. Following the scheme discussed in
Section X (that will be fleshed out in Section XI G), we
may always optimize parameters such as the resolution,
threshold, or the block size to obtain a better
segmentation.



F. Detection of camouflaged objects 

In the images of Fig. 12, the target objects are very
different from their background. However, there are
images wherein (camouflaged) objects are similar to their
background. In what follows, we report on the
results of our community detection algorithm when these
challenging images were analyzed. In all of the cases
below in Section XI F 1, the edge weights were initialized by
the Fourier amplitudes discussed earlier (Section IV B 3).




FIG. 14: [Color Online.] The "multiresolution" result for the
zebra image with a fixed number of communities $q = 3$ and
resolution $\gamma = 1$. (a) The variation of information $V$ as a
function of the negative threshold $-\bar{V}$ (horizontal axis, from
600 to 1800) for the zebra image in panel (b). The peaks in $V$
correspond to changes of structure. We choose three peaks and run
the algorithm at these three particular thresholds. (b) The
weighted results for the zebra image at the corresponding
thresholds: $\bar{V}_1 = -760$ (I), $\bar{V}_2 = -1040$ (II), and
$\bar{V}_3 = -1200$ (III). As $|\bar{V}|$ increases, fewer
regions in the zebra merge into the background, and the boundary
becomes clearer. If we increase the threshold magnitude further,
the result becomes noisier, as the last image at $\bar{V} = -1200$ (III)
shows.



In the case of the dalmatian dog image in Section XI F 2,
we further employed the method of average intensity
difference between blocks discussed in Section IV B 2. In all
cases but this last one of the dalmatian dog, we fixed the
length scale parameter $\ell$ of Section IV B 3 to be infinite.
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(a) The variation of information V as a function 
of length $\ell$ at $\gamma = 0.1$.
(b) The normalized mutual information $I_N$ as a
function of length $\ell$ at $\gamma = 0.1$.
(c) The corresponding image segmentation result at the
extremum of $V$/$I_N$ in panels (a)/(b).
(d) The variation of information $V$ as a function
of length $\ell$ at $\gamma = 0.05$.
(e) The normalized mutual information $I_N$ as a
function of length $\ell$ at $\gamma = 0.05$.
(f) The corresponding image segmentation results at the
extremum, and at a point close to the peak, of $V$/$I_N$ in
panels (d)/(e).



1. Images of a leopard, a lizard, and a zebra 

"Camouflage" refers to a method of hiding. It allows 
for an otherwise visible organism or object to remain
unnoticed by blending with its environment. The leopard
in the first row of Fig. 13 is color camouflaged. With our
algorithm, we are able to detect most parts of the
leopard except the head. The lizard in the second row uses
not only color camouflage but also style camouflage:
both the lizard and the ground are composed of
grey spots. Our algorithm nevertheless detects the lizard.
The zebra in the bottom row also uses camouflage: both the
background and the zebra have black-and-white stripes. Our
result is very accurate, even though the algorithm mistakenly
treats the middle portion of the zebra (the position of the
"hole") as background. This is because, in this region,
the stripes within the zebra are very hard to distinguish
from the stripes in the background: both are regular
and vertical.

FIG. 15: [Color Online.] Results of our algorithm as a function
of the length scale $\ell$ in Eq. (7) for the dalmatian dog
image. Plots of the variation of information and the normalized
mutual information ($V$, $I_N$) as a function of the length
scale $\ell$ appear in panels (a, b) (at a resolution of $\gamma = 0.1$) and
(d, e) ($\gamma = 0.05$). Panel (c) shows the original image together
with the corresponding segmentation. As seen
in panels (a, b), a coincident local maximum of $V$ and local
minimum of $I_N$ appears (for $\gamma = 0.1$) at $\ell = 0.63$. Similarly,
panel (f) shows the images corresponding to the peak of $V$
(coincident with a local minimum of $I_N$) in panel (d) (and
(e)) at $\ell_2 = 1.29$ (and $\gamma = 0.05$). We also examine the results for
$\ell_1 = 1$ in panel (f). We are able to detect the body and the
back two legs of the dog, albeit with some "bleeding", in
panel (f). In (c), the detection works well except for the inclusion
of some "shade" noise under the body.

(a) A 3D plot of the normalized mutual information $I_N$ as a
function of $\log(\ell)$ and $\log(\gamma)$.
(b) The 3D plot of the variation of information $V$ as a
function of $\log(\ell)$ and $\log(\gamma)$.

We applied the "multiresolution" algorithm to the zebra
image in the last row of Fig. 13, as shown in Fig. 14.
The number of communities is $q = 3$, the resolution
parameter is $\gamma = 1$, and the threshold $\bar{V}$ was varied from
$\bar{V} = -600$ to $\bar{V} = -1800$. At low $|\bar{V}|$, some
regions inside the zebra tend to merge into the
background (the image with the threshold $\bar{V} = -760$). As
the background threshold $|\bar{V}|$ increases in magnitude, the
boundary of the zebra becomes sharper (the shown
segmentation corresponds to a threshold of $\bar{V} = -1040$).
For yet larger values of $|\bar{V}|$, the results are noisy (the
image with the threshold $\bar{V} = -1200$). Thus, in the range
$760 < |\bar{V}| < 1200$, we obtain the clear detection seen in
the last row of Fig. 13.
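
The threshold scan described above rests on comparing replica partitions with information theory measures. The following is a minimal sketch under stated assumptions: it uses the standard definitions $V(A,B) = H(A) + H(B) - 2I(A,B)$ and $I_N(A,B) = 2I(A,B)/(H(A) + H(B))$, averages them over all replica pairs, and picks out interior local maxima of a scanned curve. All function names and the toy data are hypothetical, and no actual segmentation is performed here.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(labels):
    """Shannon entropy (in bits) of a partition given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information (in bits) between two partitions of the same nodes."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((nab / n) * math.log2(nab * n / (ca[x] * cb[y]))
               for (x, y), nab in cab.items())

def variation_of_information(a, b):
    return entropy(a) + entropy(b) - 2.0 * mutual_information(a, b)

def normalized_mutual_information(a, b):
    h = entropy(a) + entropy(b)
    return 1.0 if h == 0 else 2.0 * mutual_information(a, b) / h

def replica_average(replicas, measure):
    """Average a pairwise measure over all pairs of replica partitions."""
    pairs = list(combinations(replicas, 2))
    return sum(measure(a, b) for a, b in pairs) / len(pairs)

def local_maxima(xs, ys):
    """Return the x values at which ys has a strict interior local maximum."""
    return [xs[i] for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1]]

# Hypothetical toy data: three replicas, two of which agree up to relabeling.
replicas = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
mean_v = replica_average(replicas, variation_of_information)
mean_in = replica_average(replicas, normalized_mutual_information)
peaks = local_maxima([600, 700, 800, 900], [0.1, 0.5, 0.2, 0.3])
```

In an actual scan, one would recompute the replica averages at each threshold and then inspect the segmentations at the thresholds returned by `local_maxima`.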



2. Dalmatian dog 

The camouflaged dalmatian dog in panel (c) of Fig. 15
(and Fig. [J) is a particularly challenging image. We
invoke the method detailed in Sec. IV B 2 to assign edge




(c) Plots of the susceptibility $\chi$ as a
function of $\log(\ell)$ and $\log(\gamma)$.
(d) The Shannon entropy $H$ as a
function of $\log(\ell)$ and $\log(\gamma)$.
(e) The energy $E$ as a function of
$\log(\ell)$ and $\log(\gamma)$.

FIG. 16: Plots of $I_N$, $V$, $\chi$, $H$, and the energy $E$ as functions
of $\log(\ell)$ and $\log(\gamma)$ for the "dalmatian dog" image in Fig. 15.



weights. We then apply the multiresolution algorithm
to ascertain the length scale $\ell$ in Eq. (7). The
inter-replica averages of the variation of information $V$ and
the normalized mutual information $I_N$ are shown in
panels (a, b) for $\gamma = 0.1$ and in panels (d, e) for
$\gamma = 0.05$ of Fig. 15. These
information theory overlaps indicate that, as a function
of $\ell$, there are, broadly, two different regimes separated
by a transition at $\ell \sim 1$. We determine the value of $\ell$ at
the local information theory extremum that is proximate
to this transition and determine the edge weights set by
this value of $\ell$ (see Eq. (7)). In Section XI G 1, we will
illustrate how we may determine an optimal value of $\ell$.
We segment the original image of the dalmatian dog
via our community detection algorithm as shown in
panels (c) and (f) of Fig. 15. The result in panel (c)
corresponds to a resolution of $\gamma = 0.1$. The image on the
right in panel (c) is the superposition of our
result and the original image (on the left) at the particular
length $\ell = 0.63$. The "green" color corresponds to the
dalmatian dog. The method is able to detect almost all
parts of the dog, apart from the inclusion of "shade" noise
under the body. The results in panel (f) correspond to
a resolution of $\gamma = 0.05$. The image on the left in panel
(f) is the superposition of our result
and the original image at the length $\ell_1 = 1$, which
is close to the maximum of $V$ (and the local minimum
of $I_N$). On the right, we provide the result for $\ell_2 = 1.29$
(a value of $\ell$ corresponding to a maximum of $V$ and a
minimum of $I_N$). The "purple" color in the segmented
image corresponds to the dalmatian dog. We are able
to detect the body and the two back legs, albeit
with some "bleeding". As we will discuss in the
next subsection, it is possible to relate the contending
solutions found in Fig. 15 for different values of $\gamma$ and $\ell$
to the character of the phase diagram.



G. Phase Diagram 

As previously alluded to in Sec. X, we investigated
numerically the phase diagram and the character of the
transitions of the community detection problem for
general graphs in [53]. From this, we were able to
distinguish between the "easy", "hard", and "unsolvable"
phases as well as additional transitions between
contending solutions within these phases (see, e.g., our discussion
in Section VII B). Strictly speaking, of course,
different phases appear only in the thermodynamic limit of
a large number of nodes (i.e., $N \to \infty$). Nevertheless,
for large enough systems ($N \gg 1$), different phases are,
essentially, manifest. As we will now illustrate, the
analysis of the phase diagram enables the determination of the
optimal parameters for the image segmentation problem.
To make this connection clear, we will, in this section,
detail the phase diagrams of several of the images that
we have analyzed thus far.



1. Phase diagram of the Potts model corresponding to the 
dalmatian dog image 

We will now analyze the thermodynamic and
information theory measures as they pertain to the dalmatian
dog image (Fig. 15) for a range of parameters. In a
separate analysis, in subsection XI G 2, we will extend this
approach to finite temperature (i.e., $T > 0$), where
a heat bath algorithm was employed. Here, we will
content ourselves with the study of the zero temperature
case that we have focused on thus far.

Plots of the normalized mutual information $I_N$,
variation of information $V$, susceptibility $\chi$, entropy $H$, and
the energy $E$ are displayed in Fig. 16. We set the
background intensity to $\bar{V} = 15$. The block size is
$L_x \times L_y = 11 \times 11$. We then varied the resolution $\gamma$ and
the spatial scale $\ell$ within a domain given by $\gamma \in [0.01, 0.1]$
and $\ell \in [0.4, 4]$. In Fig. 16, all logarithms are in
base 10 (i.e., $\log_{10}$).

Several local extrema are manifest in Fig. 16. In the
context of the data to be presented below, the quantity
$Q$ of Eq. (21) can be $I_N$, $V$, $\chi$, $H$, or $E$, and $z$ may be
$\gamma$ or $\ell$. Examining the squares of the gradients of these
quantities, as depicted in Fig. 17, aids the identification
of more sharply defined extrema and broad regions of the
parameter space that correspond to different phases.

In Fig. 17, we compute the squares of the gradients of
$I_N$, $V$, $\chi$, $H$, and $E$ in panels (a) through (e). Panel
(f) shows the sum of the squares of the gradients of
$I_N$, $V$, and $\chi$. A red dot denotes parameters for a
"good" image segmentation, with the parameter pair
being $(\gamma, \ell) = (0.05, 1)$ (or $(\log(\gamma), \log(\ell)) = (-1.3, 0)$),
corresponding to the left hand segmentation in panel (f)
of Fig. 15. Clearly, the red dot is located at a
local minimum in each panel. This establishes the
correspondence between the optimal parameters and the
general structure of the information theoretic and
thermodynamic quantities.

As evinced in Fig. 17, there is a single local minimum
which is surrounded by several peaks in the 3D plots
of the squares of the gradients of $I_N$ and $V$ (panels (a), (b))
and their sum (panel (f)). For the dalmatian dog image
(Fig. 15), setting $Q$ in Eq. (21) to be the square of the
gradients efficiently locates optimal parameters. Note
that the other contending solutions in Fig. 15 relate
naturally to the one at $\gamma = 0.05$ and $\ell = 1$. The
$\ell = 1.29$ (i.e., $\log(\ell) = 0.11$) solution on the right hand side
of panel (f) appears in the same "basin" as that of the
$\ell = 1$ solution. Indeed, both segmentations of panel (f)
of Fig. 15 share similar features. By contrast, the
$\gamma = 0.1$ and $\ell = 0.63$ (i.e., $(\log(\gamma), \log(\ell)) = (-1, -0.2)$)
segmentation result of panel (c) in Fig. 15 relates to a
different region.
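
The search for a minimum of the squared gradient can be sketched as follows. This is a minimal illustration, assuming that a measure such as $I_N$ or $V$ has been tabulated on a uniform grid in $(\log(\gamma), \log(\ell))$. It uses simple finite differences, and the names `squared_gradient`, `add_maps`, and `interior_local_minima` as well as the toy surface are hypothetical rather than taken from this work.

```python
def partial(values, k, h):
    """Finite-difference derivative at index k (central inside, one-sided at the ends)."""
    n = len(values)
    if k == 0:
        return (values[1] - values[0]) / h
    if k == n - 1:
        return (values[n - 1] - values[n - 2]) / h
    return (values[k + 1] - values[k - 1]) / (2.0 * h)

def squared_gradient(q, dx, dy):
    """Return |grad q|^2 for q[i][j] sampled with grid spacings dx (index i) and dy (index j)."""
    nx, ny = len(q), len(q[0])
    out = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            column = [q[k][j] for k in range(nx)]
            gx = partial(column, i, dx)
            gy = partial(q[i], j, dy)
            out[i][j] = gx * gx + gy * gy
    return out

def add_maps(maps):
    """Pointwise sum of several squared-gradient maps of equal shape."""
    nx, ny = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) for j in range(ny)] for i in range(nx)]

def interior_local_minima(g):
    """Grid points that are strictly smaller than all eight neighbours."""
    minima = []
    for i in range(1, len(g) - 1):
        for j in range(1, len(g[0]) - 1):
            neighbours = [g[i + a][j + b] for a in (-1, 0, 1) for b in (-1, 0, 1)
                          if (a, b) != (0, 0)]
            if all(g[i][j] < v for v in neighbours):
                minima.append((i, j))
    return minima

# Hypothetical toy surface with a flat spot at grid index (2, 1).
toy = [[(i - 2) ** 2 + (j - 1) ** 2 for j in range(5)] for i in range(5)]
g = squared_gradient(toy, 1.0, 1.0)
minima = interior_local_minima(g)
```

For several measures, the maps returned by `squared_gradient` can be combined with `add_maps` before searching for minima, in the spirit of the summed map of panel (f).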



2. A finite temperature phase diagram 

Fig. 18 depicts the finite temperature ($T > 0$) phase
diagram of the image of the bird of Fig. 19. We will find
that for this easy image, the phase boundaries between
the easy, hard, and unsolvable phases of the image are
relatively sharply defined.

In the context of the data to be presented, we fixed
the background intensity $\bar{V} = 15$, set the block size to be
$L_x \times L_y = 1 \times 1$, and took the spatial scale $\ell \to \infty$. The
varying parameters are the resolution $\gamma$ and the temperature
$T$. Instead of applying our community detection
algorithm at zero temperature, we employ its finite
temperature version [53] in this section. The ranges of the $\gamma$
and $T$ values are $[0.001, 100]$ and $[0, 1000]$, respectively.
In the panels of Fig. 18, we show the normalized mutual
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(a) The square of the gradient of In 
(panel (a) of Fig. 16) as a function of
$\log(\ell)$ and $\log(\gamma)$.
(b) The square of the gradient of $V$
(panel (b) of Fig. 16) as a function of
$\log(\ell)$ and $\log(\gamma)$.
(c) The square of the gradient of $\chi$
(panel (c) of Fig. 16) as a function of
$\log(\ell)$ and $\log(\gamma)$.



information $I_N$, variation of information $V$,
susceptibility $\chi$, energy $E$, and Shannon entropy $H$ as functions
of the temperature $T$ and the logarithm of the resolution
$\log(\gamma)$.

We can clearly distinguish the "easy", "hard", and
"unsolvable" phases in the 3D plots of $I_N$ (panel (a)), $V$
(panel (b)), and $H$ (panel (e)). The label "A" in panel
(a) marks the "easy" phase, where $\gamma \in [0.001, 0.3]$ for
$T \in [0, 500]$ and $\gamma \in [0.001, 0.01]$ for $T \in [500, 1000]$. The
"easy" phase becomes narrower as the temperature increases.
The corresponding image segmentation result shown in
Fig. 19 validates the label of the "easy" phase. The "A"
image in Fig. 19 is obtained by running our community
detection algorithm with parameter pairs located in
the area labeled by "A" in Fig. 18. The image
segmentation denoted by "A" perfectly separates the bird from
the background. The bird is essentially composed of two
clusters and the background forms one contiguous
cluster. This reflects the true composition of the original
image on the upper left. Thus, the bird image can be
perfectly segmented in an unsupervised way when the
parameters are chosen in the "A" region (corresponding to
the computationally "easy" phase).

(d) The square of the gradient of $H$
(panel (d) of Fig. 16) as a function
of $\log(\ell)$ and $\log(\gamma)$.
(e) The square of the gradient of $E$
(panel (e) of Fig. 16) as a function
of $\log(\ell)$ and $\log(\gamma)$.
(f) The sum of the squares of the
gradients of $I_N$, $V$, and $\chi$ (panels
(a), (b), and (c) of Fig. 16) as a
function of $\log(\ell)$ and $\log(\gamma)$.

FIG. 17: Information theory and thermodynamic measures
relating to the dalmatian dog image of Fig. 15. The squares
of the gradients of $I_N$, $V$, $\chi$, $H$, and $E$ (panels (a)-(e)) and the sum
of the squares of the gradients of $I_N$, $V$, and $\chi$ (panel (f)) as
functions of $\log(\ell)$ and $\log(\gamma)$. The red dot in each panel
denotes the location of the parameters $(\log(\ell), \log(\gamma)) =
(0, -1.3)$ (i.e., $(\ell, \gamma) = (1, 0.05)$) of the results in Fig. 15. The
good segmentation found for these parameters correlates with
a local minimum within each panel.

(a) The normalized mutual information
$I_N$ as a function of the resolution
$\log(\gamma)$ and temperature $T$.
(b) The variation of information $V$ as
a function of the resolution $\log(\gamma)$
and temperature $T$.
(c) The susceptibility $\chi$ as a function
of the resolution $\log(\gamma)$ and
temperature $T$.

The region surrounding point "B" in panel (b) of Fig.
18 denotes the "hard" phase, where $\gamma$ is in the range
$[0.3, 100]$ and $T$ is in the range $[0, 500]$. Within the
"hard" phase, as the corresponding image labeled by
"B" in Fig. 19 illustrates, the bird is composed of
numerous small clusters while the background still forms
one cluster. In this phase, the image segmentation
becomes harder and some more complicated objects cannot
be detected.

The label "C" in panel (c) of Fig. 18 denotes the
"unsolvable" phase, where the range for $\gamma$ and $T$ is about




(d) The energy $E$ as a function of the
resolution $\log(\gamma)$ and temperature $T$.
(e) The Shannon entropy $H$ as a
function of the resolution $\log(\gamma)$ and
temperature $T$.

FIG. 18: The normalized mutual information $I_N$, variation
of information $V$, susceptibility $\chi$, energy $E$, and Shannon
entropy $H$ as functions of the resolution $\log(\gamma)$ and
temperature $T$ for the "bird" image in Fig. 19. In panel (a), we
mark (i) the "easy" phase (where $I_N$ is almost 1) as "A", (ii)
the "hard" phase (where $I_N$ decreases) by "B", and (iii)
the "unsolvable" phase (where $I_N$ forms a plateau whose
value is less than 1) by "C". The physical character of the
"easy", "hard", and "unsolvable" phases is further evinced
by the corresponding image segmentation results in Fig. 19.
We can determine the signatures of the three phases in all
panels apart from panel (c), the 3D plot of the susceptibility $\chi$.



$[0.1, 100]$ and $[500, 1000]$, respectively. The corresponding
image in Fig. 19 labeled by "C" is composed of numerous
small clusters for which it is virtually impossible to
distinguish the bird from the background. In this phase, the
normalized mutual information $I_N$ is far less than 1
(indicating, as expected, the low quality of the segmentations).

The other 3D plots in Fig. 18 generally show similar phase
transitions. In particular, the 3D entropy plot (panel (e))
clearly depicts the three phases and the boundaries
between them.
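
The finite temperature runs of this subsection rely on a heat bath algorithm. The following is a minimal sketch of a generic heat bath update for a Potts-type energy, not the implementation used in this work. It assumes a simplified energy in which placing node $i$ in the community of node $j$ lowers the energy by a coupling $J_{ij}$. The couplings, function names, and toy graph are hypothetical, and the Hamiltonian of Eq. (2) is not reproduced here.

```python
import math
import random

def heat_bath_probabilities(energies, temperature):
    """Boltzmann probabilities over candidate communities; T <= 0 keeps only the minima."""
    lowest = min(energies)
    if temperature <= 0:
        ties = [1.0 if e == lowest else 0.0 for e in energies]
        return [t / sum(ties) for t in ties]
    # Shift by the lowest energy so the exponentials cannot overflow.
    weights = [math.exp(-(e - lowest) / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def local_energies(node, labels, couplings, q):
    """Energy of placing `node` in each of q communities (assumed: E = -sum_j J_ij delta)."""
    energies = [0.0] * q
    for other, j_ij in couplings[node].items():
        energies[labels[other]] -= j_ij
    return energies

def heat_bath_sweep(labels, couplings, q, temperature, rng):
    """Update every node once, drawing its new community from the heat bath distribution."""
    for node in range(len(labels)):
        probs = heat_bath_probabilities(
            local_energies(node, labels, couplings, q), temperature)
        labels[node] = rng.choices(range(q), weights=probs)[0]
    return labels

# Hypothetical toy graph: nodes (0, 1) and (2, 3) attract, while cross pairs repel.
couplings = [
    {1: 1.0, 2: -1.0, 3: -1.0},
    {0: 1.0, 2: -1.0, 3: -1.0},
    {3: 1.0, 0: -1.0, 1: -1.0},
    {2: 1.0, 0: -1.0, 1: -1.0},
]
labels = heat_bath_sweep([0, 1, 0, 1], couplings, q=2, temperature=0.0,
                         rng=random.Random(0))
hot = heat_bath_probabilities([0.0, 1.0], 1e9)
```

At zero temperature the update reduces to a greedy move to the lowest energy community, whereas at very high temperature all communities become nearly equally likely, which is consistent with the loss of structure in the "unsolvable" region.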



XII. CONCLUSIONS 

In summary, we applied a multi-scale replica inference
based community detection algorithm to address
unsupervised image segmentation. The resolution parameters
can be adjusted to reveal the targets at different levels
of detail determined by extrema and transitions. In
images with uniform targets, we assigned edge
weights based on the color difference. For images with
non-uniform targets, we applied a Fourier transformation
within blocks and assigned the edge weights based
on an overlap. Our image segmentation results were
shown to be at least as accurate as some of the best to
date (see, e.g., Table I) for images with both uniform
and non-uniform targets. The images analyzed in this
work cover a wide range of categories: animals, trees,
flowers, cars, brain MRI images, etc. Our algorithm is
especially suited for the detection of camouflaged images.

We illustrated the existence of the analogs of three
computational phases ("easy-hard-unsolvable") found
in the satisfiability ($k$-SAT) problem Hzl IH| in the
image segmentation problem as it was formulated in
our work. When the system exhibits a hierarchical or
general multi-scale structure, transitions further appear
between different contending solutions. With the aid
of the structure of the general phase diagram, optimal
parameters for the image segmentation analysis may
be discerned. This general approach of relating the
thermodynamic phase diagram to parameters to be used
in an image segmentation analysis is not limited to the
particular Potts model formulation for unsupervised
image segmentation that was introduced in this work.
In an upcoming work, we will illustrate how supervised
image segmentation, with edge weights that are inferred
from a Bayesian analysis with prior probabilities for
various known patterns (or training sets), can be addressed
along similar lines [63]. We conclude with a speculation.
It may well be that, in real biological neural networks,
parameters are adjusted such that the system is solvable
for a generic expected input and critically poised next
to the boundaries between different contending solutions.

Software.

The software package for the "multi-resolution
community detection" algorithm [38] that
was used in this work is available at
http://www.physics.wustl.edu/zohar/communitydetection/.

FIG. 19: [Color Online.] The image segmentation results for
the "bird" image. The original image is on the upper left.
The segmentations denoted by "A", "B", and "C" correspond
to results with different parameter pairs $(\log(\gamma), T)$ that are
marked in panel (a) of Fig. 18. Both results "A" and "B"
are able to distinguish the "bird" from the "background".
However, in result "B", the "bird" is composed of numerous
small clusters. The segmentation "C" does not detect the
"bird". The results shown here at points A, B, and C correlate
with the corresponding "easy-hard-unsolvable" phases in the
phase diagram in Fig. 18.

Appendix A: Improved F-value by removing small
high precision features



As seen in Sec. XI C 1, our results for the first three
images (but not the last one) are better than the
corresponding ones by the best algorithm in the Berkeley Image
Segmentation Benchmark. One possible reason for
the worse result in the last image is that our algorithm is,
in a sense, too accurate: it resolves fine details that are
absent from the ground truth. For example, in the top image
of Fig. 20, our result detects the small white spray, which
appears as dots in the background. These small dots form
small circles in the boundary image shown in the right
column. These circles are not present in the ground truth
and thus reduce the value of the precision and of F. (In this
case, F = 0.56.)

Merging these high precision small dots with the
background as, e.g., fleshed out in the second row of Fig.
20, leads to results that are equivalent to or better than
those determined by the global probability
of boundary (gPb) algorithm. A summary is presented in Table II.
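
The merging of small dots into the background can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure used to produce Table II: regions are taken to be 4-connected sets of equal labels, any region smaller than a hypothetical `min_size` is relabeled as background, and F is assumed to be the standard harmonic mean of precision and recall, $F = 2PR/(P + R)$. All names and the toy label grid are hypothetical.

```python
from collections import deque

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (standard F-measure, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

def merge_small_regions(labels, min_size, background):
    """Relabel every 4-connected region with fewer than min_size pixels as background."""
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Breadth-first flood fill to collect the region containing (r, c).
            seen[r][c] = True
            region, queue = [(r, c)], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and labels[ny][nx] == labels[r][c]):
                        seen[ny][nx] = True
                        region.append((ny, nx))
                        queue.append((ny, nx))
            if len(region) < min_size:
                for y, x in region:
                    out[y][x] = background
    return out

# Hypothetical toy segmentation: background 0, an isolated dot 1, and an object 2.
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 2, 2],
    [0, 0, 0, 2, 2],
]
cleaned = merge_small_regions(grid, min_size=2, background=0)
```

Removing such isolated dots eliminates spurious boundary circles, which raises the precision and, through the harmonic mean, the value of F.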



Appendix B: The image segmentation corresponding
to the mutual information ($I_N$) peak

As emphasized throughout this work, we focus on
inter-replica information theory overlap extrema. In
some of the earlier examples, we discussed the results
pertaining to variation of information maxima (often
correlating with normalized mutual information minima).
We now briefly discuss sample results for the normalized
mutual information maxima. We provide one such
example in Fig. 21. Herein, we plot $I_N$ as a function of $\gamma$
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(b) 



FIG. 20: [Color Online.] The image segmentation results of
our algorithm. The original images in the leftmost column were
downloaded from the Berkeley image segmentation benchmark.
The central image in the first row (third row) is the result
of our algorithm at $\gamma = 0.01$ and $\bar{V} = 20$. The right image in
the first (third) row is the boundary detection result for the
corresponding central image, obtained with the software Mathematica.
There are many dots/circles which correspond to the white spray in
the original image in the first row. The small dots/circles in the
third row correspond to the shadow in the original image. We merge
these small dots in the first and third rows into the background;
the results shown in the second and fourth rows are
smoother and closer to the ground truth. This is confirmed by
the larger F value shown in Table II.





Image   F (our algorithm)   F (our algorithm without noise)   F (gPb)
a       0.56                0.85                              0.82
b       0.65                0.73                              0.74

TABLE II: The F-measure of the images shown in Fig. 20. We
provide a comparison with the results of the global
probability of boundary (gPb) algorithm. Note that after removing
the small dots/noise in both images, the value of F increases
significantly. After this merger, our results become equivalent
to (or even better than) the best results to date.



and provide the corresponding segmented images at the
peaks of $I_N$. As shown before, in panels I-III of Fig. [2J
we provide the image segmentations that correspond to
the values of $\gamma$ for which the variation of information $V$
exhibits a local maximum. In Fig. 21, we do the same
for the normalized mutual information $I_N$.




FIG. 21: [Color Online.] The "multiresolution" result for the
zebra image with a fixed number of communities $q = 3$ and
resolution $\gamma = 1$. (a) The normalized mutual information $I_N$
as a function of the negative threshold $-\bar{V}$ (horizontal axis,
from 600 to 1800) for the zebra image in panel (b). The peaks
in $I_N$ also correspond to changes of structure. We choose three
peaks and run the algorithm at these three particular
thresholds. (b) The weighted results for the zebra image at the
corresponding thresholds: $\bar{V}_1 = -680$, $\bar{V}_2 = -960$, and
$\bar{V}_3 = -1100$. As $|\bar{V}|$ increases, fewer regions in the
zebra merge into the background, and the boundary becomes clearer.



Appendix C: The image segmentation with negative 
and positive Fourier weight 

In this brief appendix, we compare results
obtained with the weights given by Eq. (11) to
those obtained when $V_{ij}$ is set to be of the same
magnitude as in Eq. (11) but of opposite sign (referred to below
as its "negative counterpart"). In the latter case, a large
weight corresponds to a large overlap between
patterns in blocks. Thus, minimizing the Hamiltonian will
tend to fragment a nearly uniform background (for which
the overlap between different blocks within is large) and
will tend to group together regions that change. The
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i ii in 

FIG. 22: [Color Online.] The image segmentation results (II
and III) for the original camouflaged zebra in (I). In panel
(II), we used the Fourier based edge weights of Eq. (11)
with a negative background $\bar{V} = -1200$ (the other parameters
are $\gamma = 1$ and block size $l_x \times l_y = 11 \times 11$). (III) The resulting
segmentation when the sign on the right hand side of Eq. (11)
is flipped. Here, we applied a positive background $\bar{V} = 900$
(the other parameters are $\gamma = 1$ and block size $l_x \times l_y = 7 \times 7$). Both
of the results shown here (i.e., II and III) are able to detect
the zebra.



results of the application of Eq. (11) and that of its
negative counterpart are shown side by side in Fig. 22, panels II and
III. In both cases, the zebra is successfully distinguished from
the similarly striped background, provided that suitable
parameters are used. In (II), the parameters used are
as follows: the background is $\bar{V} = -1200$, the resolution
parameter is $\gamma = 1$, and the block size is $l_x \times l_y = 11 \times 11$.
In (III), we use a positive background $\bar{V} = 900$ with
the sign of Eq. (11) flipped, a resolution $\gamma = 1$, and a block size
$l_x \times l_y = 7 \times 7$. The difference between results (II) and (III)
in Fig. 22, which is due to the different Fourier weights, is
as follows. In (II), the background forms a large cluster and
the zebra is composed of many small clusters. In (III),
the zebra forms a single large community while the
background is composed of many small communities. For the
images in Fig. 12, we substituted the weights of Eq. (11),
along with a negative background $\bar{V}$, into Eq. (2).
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